Methods and Devices for Nucleic Acid Synthesis

ABSTRACT

Methods and apparatus relate to the synthesis of polynucleotides having a predefined sequence on a support. Assembly methods include primer extension to generate overlapping construction oligonucleotides and assembly of the polynucleotides of interest onto anchor support-bound oligonucleotides. Methods and apparatus for selection of polynucleotides having the predefined sequence and/or length are disclosed.

RELATED APPLICATIONS

This application is a continuation application of U.S. application Ser. No. 13/884,463, filed as a National Phase of International Application No. PCT/US2011/060243 on Nov. 10, 2011, which claims priority to and the benefit of U.S. Provisional Application No. 61/412,937, filed Nov. 12, 2010, U.S. Provisional Application No. 61/418,095, filed Nov. 30, 2010, U.S. Provisional Application No. 61/466,814, filed Mar. 23, 2011, and U.S. Provisional Application No. 61/503,722, filed Jul. 1, 2011. The contents of each of the foregoing applications are incorporated herein by reference in their entirety.

FIELD OF THE INVENTION

Methods and apparatuses provided herein relate to the synthesis and assembly of nucleic acids and nucleic acid libraries having a predefined sequence. More particularly, methods and apparatuses are provided for the synthesis of target polynucleotides on a solid support and for the selection of the target polynucleotides.

BACKGROUND

Using the techniques of recombinant DNA chemistry, it is now common for DNA sequences to be replicated and amplified from nature and then disassembled into component parts. As component parts, the sequences are then recombined or reassembled into new DNA sequences. However, reliance on naturally available sequences significantly limits the possibilities that may be explored by researchers. While it is now possible for short DNA sequences to be directly synthesized from individual nucleosides, it has been generally impractical to directly construct large segments or assemblies of polynucleotides, i.e., polynucleotide sequences longer than about 400 base pairs.

Oligonucleotide synthesis can be performed through massively parallel custom syntheses on microchips (Zhou et al. (2004) Nucleic Acids Res. 32:5409; Fodor et al. (1991) Science 251:767). However, current microchips have very low surface areas and hence only small amounts of oligonucleotides can be produced. When released into solution, the oligonucleotides are present at picomolar or lower concentrations per sequence, concentrations that are insufficiently high to drive bimolecular priming reactions efficiently. Current methods for assembling nucleic acids require that oligonucleotides from microchips be amplified prior to assembly. As such, a need remains for improved, cost-effective methods and devices for high-fidelity gene assembly and the production of large numbers of polynucleotide sequences.

SUMMARY

Aspects of the invention relate to methods and apparatuses for preparing and/or assembling high fidelity polymers. Also provided herein are devices and methods for processing nucleic acid assembly reactions and assembling nucleic acids. It is an object of this invention to provide practical, economical methods of synthesizing custom nucleic acids.

Aspects of the invention relate to methods and devices for producing a polynucleotide having a predetermined sequence on a solid support. In some embodiments, pluralities of support-bound single-stranded oligonucleotides are provided at different features of a solid support, each plurality of oligonucleotides having a predefined sequence, each plurality being bound to a different discrete feature of the support. In some embodiments, each plurality of oligonucleotides comprises a sequence region at its 3′ end that is complementary to a sequence region of a 3′ end of another oligonucleotide and wherein the first plurality of oligonucleotides has a 5′ end that is complementary to a 5′ end of a first anchor single-stranded oligonucleotide. In some embodiments, the plurality of support-bound oligonucleotides is immobilized on the support. In some embodiments, the plurality of support-bound oligonucleotides is synthesized on the solid support. In other embodiments, the plurality of support-bound oligonucleotides is spotted on the solid support. In some embodiments, the support is a microarray device.

According to some embodiments, at least a first and a second plurality of support-bound single-stranded oligonucleotides are provided, wherein each first and second plurality of oligonucleotides has a predefined sequence and is bound to a discrete feature of the support. In some embodiments, each first plurality of oligonucleotides comprises a sequence region at its 3′ end that is complementary to a sequence region of a 3′ end of the second plurality of oligonucleotides. In some embodiments, a plurality of support-bound anchor single-stranded oligonucleotides are provided, wherein the 5′ end of the plurality of the first anchor oligonucleotide is the same as a sequence region of the first plurality of support-bound oligonucleotides. At least first and second pluralities of construction oligonucleotides complementary to the first and second pluralities of support-bound oligonucleotides are generated in a chain extension reaction. The construction oligonucleotides can be hybridized to the plurality of anchor oligonucleotides at a selected feature. The at least first and second pluralities of construction oligonucleotides are ligated, thereby generating the at least one polynucleotide having a predefined sequence. In some embodiments, the at least first and second pluralities of construction oligonucleotides are dissociated from the at least first and second pluralities of support-bound oligonucleotides. In some embodiments, the first plurality of construction oligonucleotides is transferred from a first feature to a selected feature and the second plurality of construction oligonucleotides is transferred from a second feature to the selected feature, wherein the selected feature comprises a plurality of support-bound anchor single-stranded oligonucleotides. In some embodiments, the selected feature is on the same support as the first and the second features. Yet in other embodiments, the selected feature is on a different support than the first and second features.

In some embodiments, a third plurality of predefined support-bound single-stranded oligonucleotides is provided, wherein each third plurality of oligonucleotides has a predefined sequence and is bound to a third discrete feature of the support, each third plurality of oligonucleotides comprising a sequence region at its 3′ end that is complementary to a sequence region of a 3′ end of the second plurality of oligonucleotides. The third plurality of construction oligonucleotides complementary to the third plurality of support-bound oligonucleotides is generated in a chain extension reaction using the single-stranded oligonucleotides as templates. The first, second and third pluralities of construction oligonucleotides are hybridized to the plurality of anchor oligonucleotides at a selected feature and ligated to produce a longer polynucleotide. In some embodiments, each plurality of construction oligonucleotides is generated on a different support. In some embodiments, each plurality of support-bound oligonucleotides has a primer binding site at its 3′ end. The primer binding site can be a universal primer binding site. In some embodiments, the method comprises annealing a primer to the at least first and second pluralities of support-bound oligonucleotides under conditions promoting primer extension, thereby forming extension product duplexes. In some embodiments, the primer sequence comprises at least one Uracil. In some embodiments, the primer containing Uracil is removed using a mixture of Uracil DNA glycosylase (UDG) and a DNA glycosylase-lyase Endonuclease VIII.

In some embodiments, the method comprises providing N pluralities of predefined support-bound single-stranded oligonucleotides wherein the first plurality of oligonucleotides comprises at its 3′ end a sequence region that is complementary to a sequence region at the 3′ end of a second oligonucleotide, wherein the N plurality of oligonucleotides comprises at its 3′ end a sequence region complementary to a sequence region of the (N-1) oligonucleotide; and providing a first plurality of anchor oligonucleotides comprising at its 5′ end a sequence that is the same as a sequence region of the first plurality of support-bound oligonucleotides. In some embodiments, N pluralities of construction oligonucleotides complementary to the support-bound single-stranded oligonucleotides are generated, the pluralities of construction oligonucleotides spanning the entire sequence of the polynucleotide without gaps. In some embodiments, the 3′ end sequence region of the first plurality of support-bound oligonucleotides is identical to the 5′ end region of the anchor oligonucleotides. In some embodiments, the extension products are dissociated thereby releasing the at least first and second pluralities of construction oligonucleotides.

Aspects of the invention relate to a method of synthesizing and selecting a polynucleotide having a predefined sequence. The method comprises synthesizing a plurality of support-bound double-stranded polynucleotides comprising a free single-stranded overhang, the plurality of polynucleotide sequences comprising the predefined polynucleotide sequence, wherein the single-stranded overhang comprises the sequence of a terminal construction oligonucleotide N. In some embodiments, a stem-loop oligonucleotide is provided wherein the stem-loop oligonucleotide comprises a single-stranded overhang and wherein the single-stranded overhang is complementary to the terminal construction oligonucleotide sequence N. The stem-loop oligonucleotide is hybridized and ligated to the free overhang of the polynucleotide having predefined sequence thereby protecting the overhang comprising the terminal oligonucleotide N. In some embodiments, polynucleotide sequences that do not comprise the terminal construction oligonucleotide sequence N are degraded using a single-strand-specific nuclease such as a single-strand-specific 3′ exonuclease, a single-strand-specific endonuclease, or a single-strand-specific 5′ exonuclease. In some embodiments, the methods comprise hybridizing a pool of oligonucleotides to an anchor support-bound single-stranded oligonucleotide, the oligonucleotide pool comprising N pluralities of oligonucleotides wherein the first plurality of oligonucleotides comprises at its 5′ end a sequence region that is complementary to a sequence region at the 5′ end of the anchor oligonucleotide, and wherein a N plurality of oligonucleotides comprises at its 3′ end a sequence complementary to a sequence region of the (N-1) oligonucleotide. In some embodiments, the stem-loop oligonucleotide comprises a type II restriction site and the stem-loop oligonucleotide is removed using a type II restriction endonuclease. In some embodiments, the stem-loop oligonucleotide comprises at least one Uracil nucleotide and the stem-loop oligonucleotide is removed using a mixture of Uracil DNA glycosylase (UDG) and a DNA glycosylase-lyase Endonuclease VIII. In some embodiments, the anchor oligonucleotide and the polynucleotides are released from the support using a mixture of Uracil DNA glycosylase (UDG) and a DNA glycosylase-lyase Endonuclease VIII. In some embodiments, the polynucleotides are released from the support, for example using a Type II restriction enzyme. In some embodiments, the predefined polynucleotide sequence is amplified.

In some aspects of the invention, methods for synthesizing a polynucleotide having a predefined sequence and selecting the predefined polynucleotide sequence according to its sequence and its length are provided. In some embodiments, a support comprising (i) a first plurality of support-bound anchor oligonucleotides, wherein the 5′ end of the first plurality of anchor oligonucleotides is complementary to the 5′ end of a first plurality of oligonucleotides and (ii) a second plurality of support-bound anchor oligonucleotides wherein the 5′ end of the second anchor oligonucleotide is complementary to a terminal construction oligonucleotide N, is provided. In some embodiments, a plurality of support-bound double-stranded polynucleotides comprising a 5′ single-stranded overhang are synthesized. The plurality of polynucleotide sequences comprises the predefined polynucleotide sequence, wherein the single-stranded 5′ overhang of the predefined polynucleotide sequence comprises the terminal construction oligonucleotide N sequence and the single-stranded 3′ end of the polynucleotide sequence comprises the first oligonucleotide sequence. The plurality of synthesized polynucleotides are hybridized, under hybridizing conditions, to the first plurality of anchor oligonucleotides. In some embodiments, the synthesized polynucleotides are subjected to hybridization conditions such that the terminal oligonucleotide N hybridizes to the 5′ end of the second plurality of anchor oligonucleotides, thereby selecting the polynucleotides having the predefined sequence using the second anchor oligonucleotide. In some embodiments, the polynucleotide sequences having a free 3′ or 5′ end are degraded using a single-strand specific exonuclease. In some embodiments, the polynucleotides having the predefined sequence are further released from the support, for example using a Type II endonuclease or using a mixture of Uracil DNA glycosylase (UDG) and a DNA glycosylase-lyase Endonuclease VIII.

In preferred embodiments, the first plurality of anchor oligonucleotides is separated from the second plurality of anchor oligonucleotides by a distance corresponding to the length of the predefined polynucleotide. The support can comprise support-bound spacer single-stranded oligonucleotides to set the distance between the first and second anchor oligonucleotides. In some embodiments, the distance between the first and second anchor oligonucleotides is a function of a concentration of the first and second anchor oligonucleotides and of the concentration of the spacer oligonucleotide.

Some aspects of the invention relate to a nucleic acid array comprising (a) a solid support; (b) a plurality of discrete features associated with the solid support wherein each feature comprises a plurality of support-bound oligonucleotides having a predefined sequence, wherein the first plurality of oligonucleotides comprises at its 5′ end a sequence region that is complementary to a sequence region at the 5′ end of a second oligonucleotide, wherein a plurality of oligonucleotides N comprises at its 5′ end a sequence complementary to a 5′ end sequence region of a plurality of oligonucleotides (N-1); and (c) at least a first plurality of anchor oligonucleotides comprising at its 5′ end a sequence that is identical to a sequence region of the first plurality of support-bound oligonucleotides. In some embodiments, the nucleic acid array further comprises a second plurality of support-bound anchor oligonucleotides wherein the 5′ end of the second anchor oligonucleotide is identical to the 5′ end of the plurality of oligonucleotides N. In some embodiments, the nucleic acid array further comprises a plurality of support-bound oligonucleotides having a sequence that is not identical to the plurality of support-bound oligonucleotides.

Aspects of the invention relate to a parallel and sequential process for the production of a plurality of polynucleotides having a predefined sequence on a support. In some embodiments, first and second supports having a plurality of features are provided, wherein each feature on each support comprises a plurality of different support-bound oligonucleotides having a different predefined sequence. First and second pluralities of different construction oligonucleotides having different predefined sequences are generated using the plurality of support-bound oligonucleotides as templates, the first and second pluralities of construction oligonucleotides having at their 3′ end complementary sequences. In some embodiments, a support comprising a plurality of features, wherein each feature comprises a plurality of support-bound anchor single-stranded oligonucleotides, is provided. In some embodiments, the 5′ end of each of the plurality of the anchor oligonucleotides is complementary to the 5′ end of the first plurality of construction oligonucleotides. The first plurality of construction oligonucleotides can be hybridized to the anchor oligonucleotides forming a first plurality of duplexes having a 3′ overhang. The second plurality of construction oligonucleotides can then hybridize to the first plurality of duplexes through the 3′ overhang, thereby forming a plurality of duplexes with a 5′ overhang. Optionally, depending on the length of the polynucleotide(s) to be synthesized, a third plurality of construction oligonucleotides is hybridized to the second plurality of construction oligonucleotides through the 5′ overhang. The pluralities of construction oligonucleotides may be ligated to form the double-stranded polynucleotides.

In some embodiments, the step of generating the plurality of first construction oligonucleotides comprises annealing a primer sequence having at least one uracil to the first plurality of support-bound oligonucleotides under conditions promoting extension of the primer and removing the primer using a mixture of Uracil DNA glycosylase (UDG) and a DNA glycosylase-lyase Endonuclease VIII. In some embodiments, the pluralities of construction oligonucleotides defining each of the polynucleotides are synthesized on a different support. The plurality of different polynucleotides can be assembled at a different feature of the support comprising the support-bound anchor oligonucleotides.

Aspects of the invention relate to methods and devices for synthesizing a plurality of polynucleotides having a predefined sequence. In some embodiments, the method comprises the steps of (a) providing a first support comprising a plurality of features, wherein each feature comprises a plurality of support-bound anchor single-stranded oligonucleotides, wherein the 5′ end of each of the plurality of the anchor oligonucleotides is complementary to the 5′ end of a first plurality of construction oligonucleotides; (b) providing a second support having a plurality of features, wherein each feature comprises a plurality of support-bound oligonucleotides, each plurality of support-bound oligonucleotides having a different predefined sequence; (c) generating a first plurality of construction oligonucleotides having different predefined sequences using the plurality of support-bound oligonucleotides as templates; (d) positioning the first and the second supports such that each feature of the second support is aligned to a corresponding feature of the first support; (e) releasing the first plurality of construction oligonucleotides in solution under conditions promoting hybridization of the first plurality of oligonucleotides to the plurality of anchor oligonucleotides; and (f) optionally repeating steps b-e with a third support comprising a second plurality of construction oligonucleotides, the second and the third pluralities of construction oligonucleotides having 3′ end complementary sequences. In some embodiments, the second support is positioned above and facing the first support. In some embodiments, the third support comprises a plurality of polynucleotides immobilized by hybridization to a plurality of anchor oligonucleotides.

In some embodiments, the step of generating the plurality of first construction oligonucleotides comprises annealing a primer sequence having at least one uracil to the first plurality of support-bound oligonucleotides under conditions promoting extension of the primer and removing the primer using a mixture of Uracil DNA glycosylase (UDG) and a DNA glycosylase-lyase Endonuclease VIII. The second support can be positioned above and facing the first support. In some embodiments, the solution comprises a ligase allowing for the ligation of the second and third pluralities of construction oligonucleotides.

In some embodiments, the step of releasing the first plurality of construction oligonucleotides in solution allows for the diffusion of the first plurality of oligonucleotides towards the anchor oligonucleotides.

In some embodiments, the step of releasing the first plurality of construction oligonucleotides in solution is in the presence of a permeable membrane allowing for a substantial vertical diffusion of the construction oligonucleotides towards the anchor oligonucleotides. In some embodiments, the permeable membrane decreases the lateral diffusion of construction oligonucleotides.

In some embodiments, each feature of the second support comprises a plurality of oligonucleotides wherein the plurality of oligonucleotides comprises at least two populations of oligonucleotides having different predefined sequences, the at least two populations of oligonucleotides having complementary sequences. For example, the two populations of oligonucleotides comprise 3′ end complementary sequences. In some embodiments, the two populations of oligonucleotides are released in solution thereby allowing for the hybridization of the first population of construction oligonucleotides to the second population of construction oligonucleotides and for the hybridization of the first population of oligonucleotides to the anchor oligonucleotides. In some embodiments, the solution comprises a ligase. In some embodiments, the stoichiometry of the first plurality of construction oligonucleotides is higher than the stoichiometry of the anchor oligonucleotides.

In some embodiments, the method of synthesizing a plurality of polynucleotides having a predefined sequence further comprises exposing the plurality of polynucleotides to a mismatch recognizing and cleaving component under conditions suitable for cleavage of double-stranded polynucleotides containing a mismatch. The plurality of polynucleotides can be support-bound or in solution. The mismatch recognizing and cleaving component can comprise a mismatch endonuclease such as a CEL I enzyme.

BRIEF DESCRIPTION OF THE FIGURES

FIGS. 1a-1e illustrate a non-limiting exemplary method of surface attached nucleic acid synthesis using a first oligonucleotide generating surface and a second anchor oligonucleotide surface.

FIGS. 2a-2e illustrate a non-limiting exemplary method of surface attached nucleic acid synthesis using a single surface comprising oligonucleotide generating sequences and anchor sequences.

FIGS. 3a-3e illustrate a non-limiting exemplary method of surface attached nucleic acid synthesis and screening for full length assembled polynucleotides using a single-strand specific exonuclease.

FIGS. 4a-4c illustrate a non-limiting exemplary method of surface attached nucleic acid synthesis and screening for full length assembled polynucleotides using a molecular ruler.

FIG. 5 illustrates non-limiting exemplary construction arrays and anchor array comprising support-bound oligonucleotides.

FIG. 6 illustrates a non-limiting method for the synthesis of construction oligonucleotides from support-bound oligonucleotides immobilized on construction arrays.

FIGS. 7a-7b illustrate a non-limiting method for highly parallel sequential surface-attached polynucleotide synthesis.

FIG. 8 illustrates an ensemble of construction arrays and an anchor array.

FIGS. 9a-9f illustrate a non-limiting method for the transfer of a first set and second set of construction oligonucleotides from the construction array to the anchor array in a fluid medium.

FIGS. 10a-10b illustrate a non-limiting method for the transfer of construction oligonucleotides from the construction array to the anchor array in a fluid medium in the presence of a porous membrane.

FIGS. 11a-11b illustrate a non-limiting method for the transfer of two different construction oligonucleotides from the construction array to the anchor array in a fluid medium and in the presence of a ligase.

FIGS. 12a-12b illustrate a non-limiting method for the transfer of two different construction oligonucleotides from the construction array to the anchor array in a fluid medium wherein the number of construction oligonucleotides is in stoichiometric excess to each corresponding anchor oligonucleotide.

FIGS. 13a-13b illustrate a non-limiting method to transfer assembled polynucleotides from one anchor array to those on another anchor array by use of an overlapping junction between the polynucleotides assembled on each anchor array.

FIGS. 14a-14f illustrate a non-limiting method for mismatch error removal from double-stranded nucleic acid sequences using a mismatch-specific endonuclease. The mismatch nucleotide is indicated by a cross.

DETAILED DESCRIPTION OF THE INVENTION

Aspects of the technology provided herein are useful for increasing the accuracy, yield, throughput, and/or cost efficiency of nucleic acid synthesis and assembly reactions. As used herein the terms “nucleic acid”, “polynucleotide”, “oligonucleotide” are used interchangeably and refer to naturally-occurring or synthetic polymeric forms of nucleotides. The oligonucleotides and nucleic acid molecules of the present invention may be formed from naturally occurring nucleotides, for example forming deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) molecules. Alternatively, the naturally occurring oligonucleotides may include structural modifications to alter their properties, such as in peptide nucleic acids (PNA) or in locked nucleic acids (LNA). The solid phase synthesis of oligonucleotides and nucleic acid molecules with naturally occurring or artificial bases is well known in the art. The terms should be understood to include equivalents, analogs of either RNA or DNA made from nucleotide analogs and as applicable to the embodiment being described, single-stranded or double-stranded polynucleotides. Nucleotides useful in the invention include, for example, naturally-occurring nucleotides (for example, ribonucleotides or deoxyribonucleotides), or natural or synthetic modifications of nucleotides, or artificial bases. As used herein, the term monomer refers to a member of a set of small molecules which are or can be joined together to form an oligomer, a polymer or a compound composed of two or more members. The particular ordering of monomers within a polymer is referred to herein as the “sequence” of the polymer. The set of monomers includes, but is not limited to, for example, the set of common L-amino acids, the set of D-amino acids, the set of synthetic and/or natural amino acids, the set of nucleotides and the set of pentoses and hexoses.

Aspects of the invention are described herein primarily with regard to the preparation of oligonucleotides, but could readily be applied in the preparation of other polymers such as peptides or polypeptides, polysaccharides, phospholipids, heteropolymers, polyesters, polycarbonates, polyureas, polyamides, polyethyleneimines, polyarylene sulfides, polysiloxanes, polyimides, polyacetates, or any other polymers.

As used herein, the terms “predetermined sequence” and “predefined sequence” are used interchangeably and mean that the sequence of the polymer is known and chosen before synthesis or assembly of the polymer. In particular, aspects of the invention are described herein primarily with regard to the preparation of nucleic acid molecules, the sequence of the nucleic acids being known and chosen before the synthesis or assembly of the nucleic acid molecules. In some embodiments of the technology provided herein, immobilized oligonucleotides or polynucleotides are used as a source of material. In various embodiments, the methods described herein use oligonucleotides, their sequence being determined based on the sequence of the final polynucleotide constructs to be synthesized. In one embodiment, oligonucleotides are short nucleic acid molecules. For example, oligonucleotides may be from 10 to about 300 nucleotides, from 20 to about 400 nucleotides, from 30 to about 500 nucleotides, from 40 to about 600 nucleotides, or more than about 600 nucleotides long. However, shorter or longer oligonucleotides may be used. Oligonucleotides may be designed to have different lengths. In some embodiments, the sequence of the polynucleotide construct may be divided up into a plurality of shorter sequences that can be synthesized in parallel and assembled into a single or a plurality of desired polynucleotide constructs using the methods described herein. In some embodiments, the assembly procedure may include several parallel and/or sequential reaction steps in which a plurality of different nucleic acids or oligonucleotides are synthesized or immobilized, primer-extended, and are combined in order to be assembled (e.g., by extension or ligation as described herein) to generate a longer nucleic acid product to be used for further assembly, cloning, or other applications.
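
As a hedged illustration of dividing a polynucleotide construct into shorter sequences, the following Python sketch tiles a target sequence into overlapping fragments and checks that the fragments reassemble without gaps. The function names, the fixed fragment length and the fixed overlap are assumptions made for this example only; the disclosure does not prescribe a particular partitioning scheme, and a practical design would also weigh factors such as the melting temperature and uniqueness of each overlap.

```python
from typing import List, Tuple

# Watson-Crick pairing for DNA; only A, C, G and T are handled in this sketch.
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def split_into_oligos(target: str, oligo_len: int, overlap: int) -> List[Tuple[int, str]]:
    """Tile `target` into fragments of at most `oligo_len` bases.

    Consecutive fragments share `overlap` bases. Each entry is
    (start position in target, fragment sequence). Hypothetical helper.
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    step = oligo_len - overlap
    fragments = []
    start = 0
    while True:
        fragments.append((start, target[start:start + oligo_len]))
        if start + oligo_len >= len(target):
            break
        start += step
    return fragments


def reassemble(fragments: List[Tuple[int, str]]) -> str:
    """Merge ordered fragments, verifying that every overlap agrees."""
    assembled = ""
    for start, frag in fragments:
        if start > len(assembled):
            raise ValueError("gap between fragments")
        if assembled[start:] != frag[: len(assembled) - start]:
            raise ValueError("overlap mismatch")
        assembled = assembled[:start] + frag
    return assembled


if __name__ == "__main__":
    target = "ATGGCTAGCTAGGATCCGATCGATTACGCTAGCTAACG"  # made-up example sequence
    fragments = split_into_oligos(target, oligo_len=12, overlap=4)
    # Template strands complementary to each fragment (illustrative only).
    templates = [reverse_complement(frag) for _, frag in fragments]
    print(len(fragments), "fragments; reassembly ok:", reassemble(fragments) == target)
```

In this sketch the reverse complements stand in for template strands from which complementary oligonucleotides could be generated by extension; that mapping is a simplification and not a statement of the disclosed design rules.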

In some embodiments, methods of assembling libraries containing nucleic acids having predetermined sequence variations are provided herein. Assembly strategies provided herein can be used to generate very large libraries representative of many different nucleic acid sequences of interest. In some embodiments, libraries of nucleic acids are libraries of sequence variants. Sequence variants may be variants of a single naturally-occurring protein encoding sequence. However, in some embodiments, sequence variants may be variants of a plurality of different protein-encoding sequences. Accordingly, one aspect of the technology provided herein relates to the assembly of precise high-density nucleic acid libraries. Aspects of the technology provided herein also provide precise high-density nucleic acid libraries. A high-density nucleic acid library may include more than 100 different sequence variants (e.g., about 10² to 10³; about 10³ to 10⁴; about 10⁴ to 10⁵; about 10⁵ to 10⁶; about 10⁶ to 10⁷; about 10⁷ to 10⁸; about 10⁸ to 10⁹; about 10⁹ to 10¹⁰; about 10¹⁰ to 10¹¹; about 10¹¹ to 10¹²; about 10¹² to 10¹³; about 10¹³ to 10¹⁴; about 10¹⁴ to 10¹⁵; or more different sequences) wherein a high percentage of the different sequences are specified sequences as opposed to random sequences (e.g., more than about 50%, more than about 60%, more than about 70%, more than about 75%, more than about 80%, more than about 85%, more than about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more of the sequences are predetermined sequences of interest).

In some embodiments, the methods and devices provided herein useoligonucleotides that are immobilized on a surface or substrate (e.g.,support-bound oligonucleotides). Support-bound oligonucleotides comprisefor example, oligonucleotides complementary to constructionoligonucleotides, anchor oligonucleotides and/or spaceroligonucleotides. As used herein the terms “support”, “substrate” and“surface” are used interchangeably and refer to a porous or non-poroussolvent insoluble material on which polymers such as nucleic acids aresynthesized or immobilized. As used herein “porous” means that thematerial contains pores having substantially uniform diameters (forexample in the nm range). Porous materials include paper, syntheticfilters etc. In such porous materials, the reaction may take placewithin the pores. The support can have any one of a number of shapes,such as pin, strip, plate, disk, rod, bends, cylindrical structure,particle, including bead, nanoparticles and the like. The support canhave variable widths. The support can be hydrophilic or capable of beingrendered hydrophilic and includes inorganic powders such as silica,magnesium sulfate, and alumina; natural polymeric materials,particularly cellulosic materials and materials derived from cellulose,such as fiber containing papers, e.g., filter paper, chromatographicpaper, etc.; synthetic or modified naturally occurring polymers, such asnitrocellulose, cellulose acetate, poly (vinyl chloride),polyacrylamide, cross linked dextran, agarose, polyacrylate,polyethylene, polypropylene, poly (4-methylbutene), polystyrene,polymethacrylate, poly(ethylene terephthalate), nylon, poly(vinylbutyrate), polyvinylidene difluoride (PVDF) membrane, glass, controlledpore glass, magnetic controlled pore glass, ceramics, metals, and thelike etc.; either used by themselves or in conjunction with othermaterials. In some embodiments, oligonucleotides are synthesized in anarray format. 
For example, single-stranded oligonucleotides are synthesized in situ on a common support, wherein each oligonucleotide is synthesized on a separate or discrete feature (or spot) on the substrate. In preferred embodiments, single-stranded oligonucleotides are bound to the surface of the support or feature. As used herein, the term “array” refers to an arrangement of discrete features for storing, amplifying and releasing oligonucleotides or complementary oligonucleotides for further reactions. In a preferred embodiment, the support or array is addressable: the support includes two or more discrete addressable features at a particular predetermined location (i.e., an “address”) on the support. Therefore, each oligonucleotide molecule on the array is localized to a known and defined location on the support. The sequence of each oligonucleotide can be determined from its position on the support. The array may comprise interfeature regions. Interfeatures may not carry any oligonucleotide on their surface and may correspond to inert space.
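The addressable array described above can be pictured as a lookup from a feature address to a known sequence. The following is a minimal sketch only; the row and column addressing scheme, the function names `build_array` and `sequence_at`, and the example sequences are illustrative assumptions and are not taken from the disclosure.

```python
from typing import Dict, List, Optional, Tuple

Address = Tuple[int, int]  # (row, column) of a feature; hypothetical layout


def build_array(sequences: List[str], n_cols: int) -> Dict[Address, str]:
    """Assign each sequence to the next free feature, filling row by row."""
    return {divmod(i, n_cols): seq for i, seq in enumerate(sequences)}


def sequence_at(array: Dict[Address, str], address: Address) -> Optional[str]:
    """Return the sequence at a feature, or None for an empty or
    interfeature position that carries no oligonucleotide."""
    return array.get(address)


# Illustrative sequences only
array = build_array(["ACGTAC", "TTGACA", "GGCATC"], n_cols=2)
print(sequence_at(array, (1, 0)))
print(sequence_at(array, (5, 5)))
```

Because the address fully determines the sequence, no sequencing step is needed to identify the oligonucleotide at a given feature in this model.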

In some embodiments, oligonucleotides are attached, spotted, immobilized, surface-bound, supported or synthesized on the discrete features of the surface or array. Oligonucleotides may be covalently attached to the surface or deposited on the surface. Arrays may be constructed, custom ordered or purchased from a commercial vendor (e.g., Agilent, Affymetrix, Nimblegen). Various methods of construction are well known in the art, e.g., maskless array synthesizers, light-directed methods utilizing masks, flow channel methods, spotting methods, etc. In some embodiments, construction and/or selection oligonucleotides may be synthesized on a solid support using a maskless array synthesizer (MAS). Maskless array synthesizers are described, for example, in PCT application No. WO 99/42813 and in corresponding U.S. Pat. No. 6,375,903. Other examples are known of maskless instruments which can fabricate a custom DNA microarray in which each of the features in the array has a single-stranded DNA molecule of desired sequence. Other methods for synthesizing oligonucleotides include, for example, light-directed methods utilizing masks, flow channel methods, spotting methods, pin-based methods, and methods utilizing multiple supports. Light-directed methods utilizing masks (e.g., VLSIPS™ methods) for the synthesis of oligonucleotides are described, for example, in U.S. Pat. Nos. 5,143,854, 5,510,270 and 5,527,681. These methods involve activating predefined regions of a solid support and then contacting the support with a preselected monomer solution. Selected regions can be activated by irradiation with a light source through a mask much in the manner of photolithography techniques used in integrated circuit fabrication. Other regions of the support remain inactive because illumination is blocked by the mask and they remain chemically protected. Thus, a light pattern defines which regions of the support react with a given monomer. 
By repeatedly activating different sets of predefined regions and contacting different monomer solutions with the support, a diverse array of polymers is produced on the support. Other steps, such as washing unreacted monomer solution from the support, can be optionally used. Other applicable methods include mechanical techniques such as those described in U.S. Pat. No. 5,384,261. Additional methods applicable to synthesis of oligonucleotides on a single support are described, for example, in U.S. Pat. No. 5,384,261. For example, reagents may be delivered to the support by either (1) flowing within a channel defined on predefined regions or (2) “spotting” on predefined regions. Other approaches, as well as combinations of spotting and flowing, may be employed as well. In each instance, certain activated regions of the support are mechanically separated from other regions when the monomer solutions are delivered to the various reaction sites. Flow channel methods involve, for example, microfluidic systems to control synthesis of oligonucleotides on a solid support. For example, diverse polymer sequences may be synthesized at selected regions of a solid support by forming flow channels on a surface of the support through which appropriate reagents flow or in which appropriate reagents are placed. Spotting methods for preparation of oligonucleotides on a solid support involve delivering reactants in relatively small quantities by directly depositing them in selected regions. In some steps, the entire support surface can be sprayed or otherwise coated with a solution, if it is more efficient to do so. Precisely measured aliquots of monomer solutions may be deposited dropwise by a dispenser that moves from region to region. Pin-based methods for synthesis of oligonucleotides on a solid support are described, for example, in U.S. Pat. No. 5,288,514. Pin-based methods utilize a support having a plurality of pins or other extensions. 
The pins are each inserted simultaneously into individual reagent containers in a tray. An array of 96 pins is commonly utilized with a 96-container tray, such as a 96-well microtiter dish. Each tray is filled with a particular reagent for coupling in a particular chemical reaction on an individual pin. Accordingly, the trays will often contain different reagents. Since the chemical reactions have been optimized such that each of the reactions can be performed under a relatively similar set of reaction conditions, it becomes possible to conduct multiple chemical coupling steps simultaneously.

In another embodiment, a plurality of oligonucleotides may be synthesized or immobilized on multiple supports. One example is a bead-based synthesis method which is described, for example, in U.S. Pat. Nos. 5,770,358; 5,639,603; and 5,541,061. For the synthesis of molecules such as oligonucleotides on beads, a large plurality of beads is suspended in a suitable carrier (such as water) in a container. The beads are provided with optional spacer molecules having an active site to which is complexed, optionally, a protecting group. At each step of the synthesis, the beads are divided for coupling into a plurality of containers. After the nascent oligonucleotide chains are deprotected, a different monomer solution is added to each container, so that on all beads in a given container, the same nucleotide addition reaction occurs. The beads are then washed of excess reagents, pooled in a single container, mixed and re-distributed into another plurality of containers in preparation for the next round of synthesis. It should be noted that by virtue of the large number of beads utilized at the outset, there will similarly be a large number of beads randomly dispersed in the container, each having a unique oligonucleotide sequence synthesized on a surface thereof after numerous rounds of randomized addition of bases. An individual bead may be tagged with a sequence which is unique to the double-stranded oligonucleotide thereon, to allow for identification during use.
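The divide, couple and pool cycle described above can be illustrated with a small simulation. This is a sketch under stated assumptions: four containers (one per nucleotide), an even random split at each round, and the function name `split_and_pool` are all hypothetical choices made for illustration. The simulation does not guarantee that every bead ends up with a unique sequence; it only shows how random redistribution diversifies the pool.

```python
import random
from typing import List


def split_and_pool(n_beads: int, n_rounds: int, seed: int = 0) -> List[str]:
    """Simulate bead-based synthesis; returns one sequence per bead."""
    rng = random.Random(seed)
    beads = ["" for _ in range(n_beads)]  # nascent chain on each bead
    for _ in range(n_rounds):
        # Divide: mix the pooled beads and distribute them among four containers
        rng.shuffle(beads)
        containers = {base: beads[i::4] for i, base in enumerate("ACGT")}
        # Couple: every bead in a given container receives the same nucleotide,
        # then all containers are pooled again for the next round
        beads = [seq + base
                 for base, group in containers.items()
                 for seq in group]
    return beads


pool = split_and_pool(n_beads=1000, n_rounds=8)
print(len(pool), "beads,", len(set(pool)), "distinct sequences")
```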

Pre-synthesized oligonucleotide and/or polynucleotide sequences may be attached to a support or synthesized in situ using light-directed methods, flow channel and spotting methods, inkjet methods, pin-based methods and bead-based methods set forth in the following references: McGall et al. (1996) Proc. Natl. Acad. Sci. U.S.A. 93:13555; Synthetic DNA Arrays In Genetic Engineering, Vol. 20:111, Plenum Press (1998); Duggan et al. (1999) Nat. Genet. S21:10; Microarrays: Making Them and Using Them In Microarray Bioinformatics, Cambridge University Press, 2003; U.S. Patent Application Publication Nos. 2003/0068633 and 2002/0081582; U.S. Pat. Nos. 6,833,450, 6,830,890, 6,824,866, 6,800,439, 6,375,903 and 5,700,637; and PCT Publication Nos. WO 04/031399, WO 04/031351, WO 04/029586, WO 03/100012, WO 03/066212, WO 03/065038, WO 03/064699, WO 03/064027, WO 03/064026, WO 03/046223, WO 03/040410 and WO 02/24597; the disclosures of which are incorporated herein by reference in their entirety for all purposes. In some embodiments, pre-synthesized oligonucleotides are attached to a support or are synthesized using a spotting methodology wherein monomer solutions are deposited dropwise by a dispenser that moves from region to region (e.g., ink jet). In some embodiments, oligonucleotides are spotted on a support using, for example, a mechanical wave actuated dispenser.

In one aspect, the invention relates to a method for producing target polynucleotides having a predefined sequence on a solid support. The synthetic polynucleotides are at least about 1, 2, 3, 4, 5, 8, 10, 15, 20, 25, 30, 40, 50, 75, or 100 kilobases (kb), or 1 megabase (Mb), or longer. In some aspects, the invention relates to a method for the production of high fidelity polynucleotides. In exemplary embodiments, a composition of synthetic polynucleotides contains at least about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 50%, 60%, 70%, 80%, 90%, 95% or more copies that are error free (e.g., having a sequence that does not deviate from a predetermined sequence). The percent of error free copies is based on the number of error free copies in the composition as compared to the total number of copies of the polynucleotide in the composition that were intended to have the correct, e.g., predefined or predetermined, sequence.
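The percent of error free copies defined above is a simple ratio. A minimal sketch follows, assuming the composition is available as a list of sequence reads all intended to match one predefined sequence; the function name `percent_error_free` and the example reads are illustrative.

```python
from typing import Sequence


def percent_error_free(copies: Sequence[str], predefined: str) -> float:
    """Percent of copies whose sequence does not deviate from `predefined`.

    `copies` holds every copy that was intended to have the predefined
    sequence, whether or not it actually does.
    """
    if not copies:
        raise ValueError("at least one copy is required")
    exact = sum(1 for copy in copies if copy == predefined)
    return 100.0 * exact / len(copies)


# Illustrative reads: three exact copies and one with a substitution
reads = ["ACGT", "ACGT", "ACGA", "ACGT"]
print(percent_error_free(reads, "ACGT"))  # 75.0
```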

In some embodiments, the nucleic acid target sequence can be obtained in a single step by mixing together all of the overlapping oligonucleotides needed to form the polynucleotide construct having the predefined sequence. Alternatively, a series of assembly reactions may be performed in parallel or serially, such that larger polynucleotide constructs may be assembled from a series of separate assembly reactions.

Some aspects of the invention relate to the design of oligonucleotides for high fidelity polynucleotide assembly. Aspects of the invention may be useful to increase the throughput rate of a nucleic acid assembly procedure and/or reduce the number of steps or amounts of reagent used to generate a correctly assembled nucleic acid sequence. In certain embodiments, aspects of the invention may be useful in the context of automated nucleic acid assembly to reduce the time, number of steps, amount of reagents, and other factors required for the assembly of each correct nucleic acid sequence. Accordingly, these and other aspects of the invention may be useful to reduce the cost and time of one or more nucleic acid assembly procedures.

Some aspects of the invention relate to a polynucleotide assembly process wherein synthetic oligonucleotides are designed and used as templates for primer extension reactions, for the synthesis of complementary oligonucleotides, and to assemble polynucleotides into longer polynucleotide constructs. In some embodiments, the method includes synthesizing a plurality of oligonucleotides or polynucleotides in a chain extension reaction using a first plurality of single-stranded oligonucleotides as templates. As noted above, the oligonucleotides may be first synthesized onto a plurality of discrete features of the surface, or may be deposited on the plurality of features of the support. The support may comprise at least 100, at least 1,000, at least 10⁴, at least 10⁵, at least 10⁶, at least 10⁷, or at least 10⁸ features. In a preferred embodiment, the oligonucleotides are covalently attached to the support. In preferred embodiments, the pluralities of oligonucleotides are immobilized to a solid surface. In a preferred embodiment, each feature of the solid surface comprises a high density of oligonucleotides having a different predetermined sequence (e.g., approximately 10⁶-10⁸ molecules per feature).

In some embodiments, pluralities of different single-stranded oligonucleotides are immobilized at different features of a solid support. In some embodiments, the support-bound oligonucleotides may be attached through their 5′ end. In a preferred embodiment, the support-bound oligonucleotides are attached through their 3′ end. In some embodiments, the support-bound oligonucleotides may be immobilized on the support via a nucleotide sequence (e.g., degenerate binding sequence), linker or spacer (e.g., photocleavable linker or chemical linker). It should be appreciated that by 3′ end, it is meant the sequence downstream to the 5′ end, and by 5′ end it is meant the sequence upstream to the 3′ end. For example, an oligonucleotide may be immobilized on the support via a nucleotide sequence, linker or spacer that is not involved in hybridization. The 3′ end sequence of the support-bound oligonucleotide then refers to a sequence upstream to the linker or spacer.

In certain embodiments, oligonucleotides may be designed to have a sequence that is identical or complementary to a different portion of the sequence of a predetermined target polynucleotide that is to be assembled. Accordingly, in some embodiments, each oligonucleotide may have a sequence that is identical or complementary to a portion of one of the two strands of a double-stranded target nucleic acid. As used herein, the term “complementary” refers to the capacity for precise pairing between two nucleotides. For example, if a nucleotide at a given position of a nucleic acid is capable of hydrogen bonding with a nucleotide of another nucleic acid, then the two nucleic acids are considered to be complementary to one another at that position. Complementarity between two single-stranded nucleic acid molecules may be “partial,” in which only some of the nucleotides bind, or it may be complete when total complementarity exists between the single-stranded molecules.
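The distinction between partial and complete complementarity can be made concrete with a position-by-position check. The sketch below assumes two DNA strands of equal length, both written 5′ to 3′ and annealed antiparallel, and considers only Watson-Crick pairs (A-T, G-C); the function name `complementarity` is illustrative.

```python
_WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementarity(strand_a: str, strand_b: str) -> float:
    """Fraction of positions that pair when strand_b anneals antiparallel
    to strand_a. Returns 1.0 for complete complementarity and a value
    between 0 and 1 for partial complementarity."""
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        1
        for base_a, base_b in zip(strand_a, reversed(strand_b))
        if _WATSON_CRICK.get(base_a) == base_b
    )
    return paired / len(strand_a)


print(complementarity("ATGC", "GCAT"))  # 1.0, complete
print(complementarity("ATGC", "GCAA"))  # 0.75, partial
```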

In some embodiments, the pluralities of construction oligonucleotides are designed such that each plurality of construction oligonucleotides comprises a sequence region at its 5′ end that is complementary to a sequence region at the 5′ end of another construction oligonucleotide and a sequence region at its 3′ end that is complementary to a sequence region at the 3′ end of a different construction oligonucleotide. As used herein, a “construction” oligonucleotide refers to one of the plurality or population of single-stranded oligonucleotides used for polynucleotide assembly. The plurality of construction oligonucleotides comprises oligonucleotides for both the sense and antisense strands of the target polynucleotide. Construction oligonucleotides can have any length, the length being designed to accommodate an overlap or complementary sequence. Construction oligonucleotides can be of identical size or of different sizes. In preferred embodiments, the construction oligonucleotides span the entire sequence of the target polynucleotide without any gaps. Yet in other embodiments, the construction oligonucleotides are partially overlapping, resulting in gaps between construction oligonucleotides when hybridized to each other. Preferably, the pool or population of construction oligonucleotides comprises construction oligonucleotides having overlapping sequences so that construction oligonucleotides can hybridize to one another under the appropriate hybridization conditions. One would appreciate that each internal construction oligonucleotide will hybridize to two different construction oligonucleotides, whereas the construction oligonucleotides at the 5′ and/or 3′ end will each hybridize to a different (or the same) internal oligonucleotide(s). Hybridization and ligation of the overlapping construction oligonucleotides will therefore result in a target polynucleotide having a 3′ and/or a 5′ overhang. 
Yet in some embodiments, the resulting target polynucleotide may comprise a blunt end at its 5′ and/or 3′ terminus. In some embodiments, if the target polynucleotide is assembled from N construction oligonucleotides, 1 to N pluralities of different support-bound single-stranded oligonucleotides are designed such that the first plurality of construction oligonucleotides comprises at its 5′ end a sequence region that is complementary to a sequence region at the 5′ end of an anchor oligonucleotide and wherein an Nth plurality of construction oligonucleotides comprises at its 3′ end a sequence region that is complementary to a 3′ end sequence region of the (N-1) construction oligonucleotide. In some embodiments, the first plurality of oligonucleotides has a 5′ end that is complementary to the 5′ end of a support-bound anchor single-stranded oligonucleotide. As used herein, the anchor oligonucleotide refers to an oligonucleotide designed to be complementary to at least a portion of the target polynucleotide and may be immobilized on the support. In an exemplary embodiment, the anchor oligonucleotide has a sequence complementary to the 5′ end of the target polynucleotide and may be immobilized on the support.

It should be appreciated that different oligonucleotides may be designed to have different lengths with overlapping sequence regions. Overlapping sequence regions may be identical (i.e., corresponding to the same strand of the nucleic acid fragment) or complementary (i.e., corresponding to complementary strands of the nucleic acid fragment). Overlapping sequences may be of any suitable length. Overlapping sequences may be between about 5 and about 500 nucleotides long (e.g., between about 10 and 100, between about 10 and 75, between about 10 and 50, about 20, about 25, about 30, about 35, about 40, about 45, about 50, etc., nucleotides long). However, shorter, longer or intermediate overlapping lengths may be used. It should be appreciated that overlaps (5′ or 3′ regions) between different input nucleic acids used in an assembly reaction may have different lengths. In some embodiments, anchor support-bound (or immobilized) oligonucleotides include sequence regions having overlapping regions to assist in the assembly of a predetermined nucleic acid sequence. In a preferred embodiment, anchor oligonucleotides include sequence regions having complementary regions for hybridization to a different oligonucleotide or to a polynucleotide (such as, for example, a sub-assembly product). The complementary regions refer to a sequence region at either a 3′ end or a 5′ end of the immobilized template oligonucleotide (e.g., template oligonucleotide). In a preferred embodiment, the complementary region is localized at the 5′ end of the anchor oligonucleotides. Complementary regions refer to a 3′ end or a 5′ region of a first oligonucleotide or polynucleotide that is capable of hybridizing to a 5′ end or 3′ end of a second oligonucleotide or polynucleotide.
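One simple way to obtain construction oligonucleotides with complementary overlaps is to tile the target with sense oligonucleotides and offset the antisense oligonucleotides by half an oligonucleotide length. The sketch below is a generic illustration of that tiling, not the specific design procedure of the disclosure; the fixed oligonucleotide length, the half-length offset, and the function names `reverse_complement` and `tile_oligos` are assumptions.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def tile_oligos(target: str, oligo_len: int = 40) -> Tuple[List[str], List[str]]:
    """Split the sense strand of `target` into gapless sense oligos and
    antisense oligos shifted by half a length, so that each internal
    antisense oligo overlaps two adjacent sense oligos."""
    if oligo_len < 2 or oligo_len % 2:
        raise ValueError("oligo_len must be an even number of at least 2")
    half = oligo_len // 2
    sense = [target[i:i + oligo_len]
             for i in range(0, len(target), oligo_len)]
    antisense = [reverse_complement(target[i:i + oligo_len])
                 for i in range(half, len(target), oligo_len)]
    return sense, antisense


# Illustrative 12-nt target with 4-nt oligos (2-nt overlaps)
sense, antisense = tile_oligos("ATGGCCTTAGCA", oligo_len=4)
print(sense)
print(antisense)
```

With this layout the overlap length equals half the oligonucleotide length; unequal overlaps, as contemplated in the text, would require per-junction offsets instead of a single fixed offset.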

In some embodiments, nucleic acids are assembled using ligase-based assembly techniques, wherein the oligonucleotides are designed to provide full length sense (or plus strand) and antisense (or minus strand) strands of the target polynucleotide construct. After hybridization of the sense and antisense oligonucleotides, the oligonucleotides on each strand are subjected to ligation in order to form the target polynucleotide construct or a sub-assembly product. Reference is made to U.S. Pat. No. 5,942,609, which is incorporated herein in its entirety. Ligase-based assembly techniques may involve one or more suitable ligase enzymes that can catalyze the covalent linking of adjacent 3′ and 5′ nucleic acid termini (e.g., a 5′ phosphate and a 3′ hydroxyl of nucleic acid(s) annealed on a complementary template nucleic acid such that the 3′ terminus is immediately adjacent to the 5′ terminus). Accordingly, a ligase may catalyze a ligation reaction between the 5′ phosphate of a first nucleic acid and the 3′ hydroxyl of a second nucleic acid if the first and second nucleic acids are annealed next to each other on a template nucleic acid. A ligase may be obtained from recombinant or natural sources. A ligase may be a heat-stable ligase. In some embodiments, a thermostable ligase from a thermophilic organism may be used. Examples of thermostable DNA ligases include, but are not limited to: Tth DNA ligase (from Thermus thermophilus, available from, for example, Eurogentec and GeneCraft); Pfu DNA ligase (a hyperthermophilic ligase from Pyrococcus furiosus); Taq ligase (from Thermus aquaticus); Ampliligase® (available from Epicenter Biotechnologies); any other suitable heat-stable ligase; or any combination thereof. In some embodiments, one or more lower temperature ligases may be used (e.g., T4 DNA ligase). A lower temperature ligase may be useful for shorter overhangs (e.g., about 3, about 4, about 5, or about 6 base overhangs) that may not be stable at higher temperatures.

Non-enzymatic techniques can be used to ligate nucleic acids. For example, a 5′-end (e.g., the 5′ phosphate group) and a 3′-end (e.g., the 3′ hydroxyl) of one or more nucleic acids may be covalently linked together without using enzymes (e.g., without using a ligase). In some embodiments, non-enzymatic techniques may offer certain advantages over enzyme-based ligations. For example, non-enzymatic techniques may have a high tolerance of non-natural nucleotide analogues in nucleic acid substrates, may be used to ligate short nucleic acid substrates, may be used to ligate RNA substrates, and/or may be cheaper and/or more suited to certain automated (e.g., high throughput) applications.

Non-enzymatic ligation may involve a chemical ligation. In some embodiments, nucleic acid termini of two or more different nucleic acids may be chemically ligated. In some embodiments, nucleic acid termini of a single nucleic acid may be chemically ligated (e.g., to circularize the nucleic acid). It should be appreciated that both strands of a first double-stranded nucleic acid terminus may be chemically ligated to both strands at a second double-stranded nucleic acid terminus. However, in some embodiments only one strand of a first nucleic acid terminus may be chemically ligated to a single strand of a second nucleic acid terminus. For example, the 5′ end of one strand of a first nucleic acid terminus may be ligated to the 3′ end of one strand of a second nucleic acid terminus without the ends of the complementary strands being chemically ligated.

Accordingly, a chemical ligation may be used to form a covalent linkage between a 5′ terminus of a first nucleic acid end and a 3′ terminus of a second nucleic acid end, wherein the first and second nucleic acid ends may be ends of a single nucleic acid or ends of separate nucleic acids. In one aspect, chemical ligation may involve at least one nucleic acid substrate having a modified end (e.g., a modified 5′ and/or 3′ terminus) including one or more chemically reactive moieties that facilitate or promote linkage formation. In some embodiments, chemical ligation occurs when one or more nucleic acid termini are brought together in close proximity (e.g., when the termini are brought together due to annealing between complementary nucleic acid sequences). Accordingly, annealing between complementary 3′ or 5′ overhangs (e.g., overhangs generated by restriction enzyme cleavage of a double-stranded nucleic acid) or between any combination of complementary nucleic acids that results in a 3′ terminus being brought into close proximity with a 5′ terminus (e.g., the 3′ and 5′ termini are adjacent to each other when the nucleic acids are annealed to a complementary template nucleic acid) may promote a template-directed chemical ligation. Examples of chemical reactions may include, but are not limited to, condensation, reduction, and/or photo-chemical ligation reactions. It should be appreciated that in some embodiments chemical ligation can be used to produce naturally-occurring phosphodiester internucleotide linkages, non-naturally-occurring phosphamide pyrophosphate internucleotide linkages, and/or other non-naturally-occurring internucleotide linkages.

In some aspects of the invention, oligonucleotides are assembled by polymerase chain extension. In some embodiments, the first step of the extension reaction uses a primer. In some embodiments, the oligonucleotides may comprise universal (common to all oligonucleotides), semi-universal (common to at least a portion of the oligonucleotides) or individual or unique primer (specific to each oligonucleotide) binding sites on either the 5′ end or the 3′ end or both ends. As used herein, the term “universal” primer or primer binding site means that a sequence used to amplify the oligonucleotide is common to all oligonucleotides such that all such oligonucleotides can be amplified using a single set of universal primers. In other circumstances, an oligonucleotide contains a unique primer binding site. As used herein, the term “unique primer binding site” refers to a set of primer recognition sequences that selectively amplifies a subset of oligonucleotides. In yet other circumstances, an oligonucleotide contains both universal and unique amplification sequences, which can optionally be used sequentially. In a first step, a primer is added and anneals to an immobilized or support-bound oligonucleotide. For example, the primer can anneal to an immobilized anchor oligonucleotide. In some embodiments, the primer is designed to be complementary to a sequence of the support-bound or immobilized oligonucleotides, referred to as the primer binding site. In the first step, a solution comprising a polymerase, at least one primer and dNTPs is added at a feature of the solid support under conditions promoting primer extension. For example, referring to FIG. 1b, a primer (50) is added at a feature comprising oligonucleotides (1′, 2′, 3′, and 4′). The primer hybridizes to the primer binding site of the support-bound oligonucleotides and, under conditions promoting primer extension, the primer is extended into a complementary oligonucleotide (1, 2, 3 or 4) using the support-bound sequence (1′, 2′, 3′ or 4′) as a template.
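The primer extension step described above can be modeled at the sequence level. The following is a minimal sketch, assuming error-free extension, a single exact primer binding site, and sequences written 5′ to 3′; the function names `reverse_complement` and `extend_primer` and the example sequences are illustrative and do not correspond to the oligonucleotides of FIG. 1b.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def extend_primer(template: str, primer: str) -> str:
    """Return the extension product of `primer` annealed to `template`.

    The primer binding site is the stretch of the template complementary
    to the primer. Extension proceeds 5' to 3' from the primer, copying
    the template region located 5' of the binding site.
    """
    site = reverse_complement(primer)
    start = template.find(site)
    if start == -1:
        raise ValueError("template has no binding site for this primer")
    copied_region = template[:start + len(site)]
    return reverse_complement(copied_region)  # begins with the primer


template = "AACCGGTTAC"          # support-bound oligonucleotide (illustrative)
primer = "GTAA"                  # complementary to the 3' terminal TTAC
print(extend_primer(template, primer))  # GTAACCGGTT
```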

In some embodiments, uracil DNA glycosylase (UDG) may be used to hydrolyze a uracil-glycosidic bond in a nucleic acid, thereby removing uracil and creating an alkali-sensitive abasic site in the DNA which can be subsequently hydrolyzed by endonuclease, heat or alkali treatment. As a result, a portion of one strand of a double-stranded nucleic acid may be removed, thereby exposing the complementary sequence in the form of a single-stranded overhang. This approach requires the deliberate incorporation of one or more uracil bases in one strand of a double-stranded nucleic acid fragment. This may be accomplished, for example, by amplifying a nucleic acid fragment using an amplification primer that contains a 3′ terminal uracil. In some embodiments, the primer is a primer containing multiple uracils (U). The primer is first annealed to a support-bound single-stranded oligonucleotide and extended with the addition of dNTPs and an appropriate polymerase under appropriate conditions and temperature. In a subsequent step, the primer may be removed. After treatment with UDG, the region of the primer 5′ to the uracil may be released (e.g., upon dilution, incubation, exposure to mild denaturing conditions, etc.), thereby exposing the complementary sequence as a single-stranded overhang. It should be appreciated that the length of the overhang may be determined by the position of the uracil on the amplifying primer and by the length of the amplifying primer. In some embodiments, a mixture of uracil DNA glycosylase (UDG) and the DNA glycosylase-lyase Endonuclease VIII, such as USER™ (Uracil-Specific Excision Reagent, New England Biolabs), is used. UDG catalyzes the excision of a uracil base, forming an abasic site while leaving the phosphodiester backbone intact. The lyase activity of Endonuclease VIII breaks the phosphodiester backbone at the 3′ and 5′ sides of the abasic site so that base-free deoxyribose is released. In subsequent steps, the primer may be removed.
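The dependence of overhang length on uracil position can be sketched as follows. This is an illustrative model only: it assumes that excision occurs at every uracil, that all primer nucleotides 5′ of the 3′-most uracil dissociate after cleavage, and that the exposed overhang therefore spans every primer position up to and including that uracil. The function name `overhang_length` and the example primers are hypothetical.

```python
def overhang_length(primer: str) -> int:
    """Number of complementary-strand nucleotides exposed after uracil
    excision, for a primer written 5' to 3' with 'U' marking uracil.

    Model assumption: the overhang covers all primer positions up to and
    including the 3'-most uracil. A primer without uracil gives 0.
    """
    last_u = primer.rfind("U")
    return last_u + 1 if last_u != -1 else 0


print(overhang_length("ACGTACGU"))    # 3' terminal uracil: whole primer length
print(overhang_length("ACGUACGTAC"))  # internal uracil: shorter overhang
```

Under this model, a 3′ terminal uracil exposes an overhang as long as the primer itself, which is consistent with the statement that both the uracil position and the primer length determine the overhang length.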

One should appreciate that the extension reactions can take place in a single volume that encompasses all of the utilized features comprising the support-bound oligonucleotides (1′, 2′, 3′ and 4′), or each step can take place in a localized individual microvolume that contains only the region(s) of interest to undergo a specific extension step. In some embodiments, the extension and/or assembly reactions are performed within a microdroplet (see PCT Application PCT/US2009/55267 and PCT Application PCT/US2010/055298, each of which is incorporated herein by reference in its entirety).

Primer extension may involve one or more suitable polymerase enzymes that can catalyze a template-based extension of a nucleic acid in a 5′ to 3′ direction in the presence of suitable nucleotides and an annealed template. A polymerase may be thermostable. A polymerase may be obtained from recombinant or natural sources. In some embodiments, a thermostable polymerase from a thermophilic organism may be used. In some embodiments, a polymerase may include a 3′→5′ exonuclease/proofreading activity. In some embodiments, a polymerase may have no, or little, proofreading activity (e.g., a polymerase may be a recombinant variant of a natural polymerase that has been modified to reduce its proofreading activity). Examples of thermostable DNA polymerases include, but are not limited to: Taq (a heat-stable DNA polymerase from the bacterium Thermus aquaticus); Pfu (a thermophilic DNA polymerase with a 3′→5′ exonuclease/proofreading activity from Pyrococcus furiosus, available from, for example, Promega); VENT® DNA Polymerase and VENT® (exo-) DNA Polymerase (thermophilic DNA polymerases with or without a 3′→5′ exonuclease/proofreading activity from Thermococcus litoralis; also known as Tli polymerase); Deep VENT® DNA Polymerase and Deep VENT® (exo-) DNA Polymerase (thermophilic DNA polymerases with or without a 3′→5′ exonuclease/proofreading activity from Pyrococcus species GB-D; available from New England Biolabs); KOD HiFi (a recombinant Thermococcus kodakaraensis KOD1 DNA polymerase with a 3′→5′ exonuclease/proofreading activity, available from Novagen); BIO-X-ACT (a mix of polymerases that possesses 5′→3′ DNA polymerase activity and 3′→5′ proofreading activity); Klenow Fragment (an N-terminal truncation of E. coli DNA Polymerase I which retains polymerase activity, but has lost the 5′→3′ exonuclease activity, available from, for example, Promega and NEB); SEQUENASE™ (T7 DNA polymerase deficient in 3′→5′ exonuclease activity); Phi29 (bacteriophage Phi29 DNA polymerase, which may be used for rolling circle amplification, for example, in a TEMPLIPHI™ DNA Sequencing Template Amplification Kit, available from Amersham Biosciences); TopoTaq (a hybrid polymerase that combines hyperstable DNA binding domains and the DNA unlinking activity of Methanopyrus topoisomerase, with no exonuclease activity, available from Fidelity Systems); TopoTaq HiFi, which incorporates a proofreading domain with exonuclease activity; PHUSION™ (a Pyrococcus-like enzyme with a processivity-enhancing domain, available from New England Biolabs); any other suitable DNA polymerase; or any combination of two or more thereof. In some embodiments, the polymerase can be an SDP (strand-displacing polymerase; e.g., an SDPe, which is an SDP with no exonuclease activity). This allows isothermal PCR (isothermal extension, isothermal amplification) at a uniform temperature. As the polymerase (for example, Phi29, Bst) travels along a template, it displaces the complementary strand (e.g., created in previous extension reactions). As the displaced DNAs are single-stranded, primers can bind at a consistent temperature, removing the need for any thermocycling during amplification.

In some embodiments, after extension or amplification, the polymerase may be deactivated to prevent interference with the subsequent steps. A heating step (e.g., high temperature) can denature and deactivate most enzymes which are not thermally stable. Enzymes may be deactivated in the presence or in the absence of liquid. Heat deactivation on a dry support may have the advantage of deactivating the enzymes without any detrimental effect on the oligonucleotides. In some embodiments, a non-thermostable version of the thermally stable PCR DNA polymerase may be used, although the enzyme is less optimized for error rate and speed. Alternatively, epoxy dATP can be used to inactivate the enzyme.

In one embodiment, a support is provided that comprises at least one feature having a plurality of surface-bound single-stranded oligonucleotides. Each of the plurality of oligonucleotides is bound to a discrete feature of the support, and the predefined sequence of each plurality of oligonucleotides attached to the feature is different from the predefined sequence of the oligonucleotides attached to a different feature. At least one plurality of oligonucleotides is synthesized in a chain extension reaction on a first feature of the support by template-dependent synthesis. In some embodiments, the entire support or array containing the discrete features is subjected to thermocycling, annealing temperature conditions, stringent melt temperature conditions, or denaturing temperature conditions. Heating and cooling the support can be performed in any thermal cycle instrument. In other embodiments, one or more discrete features are subjected to specific temperature conditions (annealing, extension, wash or melt). Thermocycling of selected independent features (being separated from each other) can be performed by locally heating at least one discrete feature. Discrete features may be locally heated by any means known in the art. For example, the discrete features may be locally heated using a laser source of energy that can be controlled in a precise x-y dimension, thereby individually modulating the temperature of a droplet. In another example, the combination of a broader beam laser with a mask can be used to irradiate specific features. In some embodiments, methods to control temperature on the support so that enzymatic reactions can take place on a support (PCR, ligation or any other temperature-sensitive reaction) are provided. In some embodiments, a scanning laser is used to control the thermocycling on distinct features on the solid support. The wavelength used can be chosen from a wide spectrum (100 nm to 100,000 nm, i.e., from ultraviolet to infrared). 
In some embodiments, the features comprising the oligonucleotides comprise an optical absorber or indicator. In some embodiments, the solid support is cooled by circulation of air or fluid. The energy to be deposited can be calculated based on the absorbance behavior. In some embodiments, the temperature of the droplet can be modeled using thermodynamics. The temperature can be measured by an LCD-like material or any other in-situ technology. In yet another embodiment, the whole support can be heated and cooled down to allow enzymatic reactions or other temperature sensitive reactions to take place. In some embodiments, an energy source can be directed by a scanning setup to deposit energy at various locations on the surface of the solid support comprising support-bound molecules. Optical absorbent material can be added on the surface of the solid support. An optical energy source, such as a high intensity lamp, laser, or other electromagnetic energy source (including microwave), can be used. The temperature of the different reaction sites can be controlled independently by controlling the energy deposited at each of the features.

For example, a Digital Micromirror Device (DMD) can be used for temperature control. A DMD is a microfabricated spatial optical modulator. See, for example, U.S. Pat. No. 7,498,176. In some embodiments, a DMD can be used to precisely heat selected spots or droplets on the solid support. The DMD can be a chip having on its surface, for example, several hundred thousand to several million microscopic mirrors arranged in an array which correspond to the spots or droplets to be heated. The mirrors can be individually rotated (e.g., ±10-12°) to an on or off state. In the on state, light from a light source (e.g., a bulb) is reflected onto the solid support to heat the selected spots or droplets. In the off state, the light is directed elsewhere (e.g., onto a heatsink). In some embodiments, the array may be a rectangular array. In one example, the DMD can consist of a 1024×768 array of 16 μm wide micromirrors. In another example, the DMD can consist of a 1920×1080 array of 10 μm wide micromirrors. Other arrangements of array sizes and micromirror widths are also possible. These mirrors can be individually addressable and can be used to create any given pattern or arrangement in heating different spots on the solid support. The spots can also be heated to different temperatures, e.g., by providing a different wavelength for individual spots and/or by controlling the time of irradiation. In certain embodiments, the DMD can direct light to selected spots and can be used to identify, select, melt, and/or cleave any oligonucleotide of choice.
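By way of illustration only, the individually addressable on/off pattern described above can be sketched in software as a two-dimensional mask, with the time of irradiation expressed as a number of display frames per mirror. The function names, the (column, row) addressing convention and the fixed frame duration are assumptions made for this sketch and do not describe any particular DMD controller.

```python
def build_mirror_mask(n_cols, n_rows, on_mirrors):
    """Return an n_rows x n_cols grid: 1 = mirror 'on' (light reaches the
    support), 0 = mirror 'off' (light directed elsewhere)."""
    mask = [[0] * n_cols for _ in range(n_rows)]
    for col, row in on_mirrors:
        if not (0 <= col < n_cols and 0 <= row < n_rows):
            raise ValueError(f"mirror ({col}, {row}) is outside the array")
        mask[row][col] = 1
    return mask


def frames_per_mirror(irradiation_ms, frame_ms):
    """Convert a per-mirror irradiation time (ms) into a whole number of
    'on' frames, assuming a fixed, hypothetical frame duration (ms)."""
    if frame_ms <= 0:
        raise ValueError("frame duration must be positive")
    return {mirror: t // frame_ms for mirror, t in irradiation_ms.items()}


if __name__ == "__main__":
    # 1024 x 768 array, as in one of the examples given above.
    mask = build_mirror_mask(1024, 768, [(0, 0), (512, 384)])
    print(sum(map(sum, mask)), "mirrors switched on")
    # Longer irradiation of one spot is one way to heat it more.
    print(frames_per_mirror({(0, 0): 500, (512, 384): 200}, 100))
```

In this sketch a spot receives more energy simply by being kept in the on state for more frames; a mapping from mirrors to spots or droplets on the support would be needed in practice and is not modeled here.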

FIGS. 1a-1e show an exemplary method for producing a polynucleotide having a predetermined sequence on a substrate or solid support. In some embodiments, polynucleotides may be assembled to synthesize the final nucleic acid sequence (e.g. target nucleic acid). Referring to FIG. 1a, a nucleic acid array 10 is shown possessing an arrangement of features 20 in which each feature comprises a plurality of support-bound single-stranded oligonucleotides 30. Preferably, support-bound oligonucleotides are attached through their 3′ end. In some embodiments, support-bound single-stranded oligonucleotides are about 20 nucleotides long, about 40 nucleotides long, about 50 nucleotides long, about 60 nucleotides long, about 70 nucleotides long, about 80 nucleotides long, about 100 nucleotides long or more. In some embodiments, the oligonucleotides 30 further comprise a universal priming site at the 3′ end (e.g. a 15-base primer binding site at the 3′ end) and a sequence complementary to a construction oligonucleotide (also referred to as a building block, and designated as 1′, 2′, 3′ etc.). In some embodiments, the construction oligonucleotides are contiguous with one another and together make up or span the sequence of the target polynucleotide. In preferred embodiments, the construction oligonucleotides span the entire sequence of the target polynucleotide without any gaps. Yet in other embodiments, the construction oligonucleotides are partially overlapping, resulting in gaps between construction oligonucleotides when hybridized to each other. Referring to FIG. 1a, the target polynucleotide is assembled from a population of construction oligonucleotides, the even numbered construction oligonucleotides representing one strand of the double-stranded target polynucleotide (e.g. plus strand) and the uneven numbers representing a complementary strand of the double-stranded target polynucleotide (e.g. minus strand).
Preferably, the pool of construction oligonucleotides comprises construction oligonucleotides having overlapping sequences so that construction oligonucleotides can hybridize to one another under the appropriate hybridization conditions. One would appreciate that each internal construction oligonucleotide will hybridize to two different construction oligonucleotides, whereas the construction oligonucleotides at the 5′ and/or 3′ terminus will each hybridize to a different (or the same) internal oligonucleotide(s). Hybridization of the overlapping construction oligonucleotides will therefore result in a target polynucleotide having a 3′ and/or a 5′ overhang. Yet in some embodiments, the resulting target polynucleotide may comprise a blunt end at its 5′ and/or 3′ terminus. The construction oligonucleotides may subsequently be ligated to form a covalently linked double-stranded nucleic acid construct (FIG. 1d) using ligation assembly techniques known in the art.
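As a non-limiting illustration of overlapping construction oligonucleotides, the following sketch tiles a target sequence into contiguous plus-strand oligonucleotides and minus-strand oligonucleotides offset by half an oligonucleotide length, so that each minus-strand oligonucleotide bridges the junction between two plus-strand oligonucleotides. The tiling scheme, the function names and the example sequence are assumptions made for this sketch; the sketch also assumes that the target length is a multiple of the oligonucleotide length.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def tile_target(target, length):
    """Split `target` (plus strand, 5'->3') into overlapping oligonucleotides.

    Returns (plus, minus): contiguous plus-strand oligonucleotides, and
    minus-strand oligonucleotides (written 5'->3') that each span the
    junction between two neighbouring plus-strand oligonucleotides.
    """
    half = length // 2
    plus = [target[i:i + length] for i in range(0, len(target), length)]
    minus = [
        reverse_complement(target[i:i + length])
        for i in range(half, len(target) - half, length)
    ]
    return plus, minus


if __name__ == "__main__":
    target = "ATGGCTAGCA" * 6  # hypothetical 60-nucleotide target
    plus, minus = tile_target(target, 20)
    for oligo in plus + minus:
        print(oligo)
```

With this tiling, the two terminal plus-strand oligonucleotides each pair with a single minus-strand oligonucleotide, leaving single-stranded overhangs at both ends of the assembled product.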

Referring to FIG. 1b, at least one feature on support 10 comprising the support-bound oligonucleotides is incubated with a primer 50. In a first step, the primer is annealed to the immobilized single-stranded oligonucleotide and extended in the presence of an appropriate polymerase and dNTPs, under appropriate extension conditions, to form construction oligonucleotides 60 (designated as 1, 2, 3, 4) which are complementary to the support-bound oligonucleotides (1′, 2′, 3′, 4′). In some embodiments, the primer is a primer containing multiple uracil (U). In a subsequent step, the primer is removed. Preferably, a USER™ endonuclease is added to digest the primer. Digestion of the primers may take place subsequent to the extension step, thereby generating a duplex comprising the construction oligonucleotides hybridized to the support-bound oligonucleotides (e.g. 1-1′, 2-2′ etc.). Yet in other embodiments, digestion of the primers occurs in solution after release of the construction oligonucleotides in solution (FIG. 1c).

In a second step (FIG. 1c), the newly synthesized extension products (construction oligonucleotides 65: 1, 2, 3, and/or 4) are melted and released from the support 10. Dissociation may be performed in parallel or sequentially. The construction oligonucleotides may be released in solution. In one embodiment, the solution is a buffer comprising 10 mM Tris, 50 mM sodium chloride, and 1 mM EDTA. Melting of the duplex may be performed by increasing the temperature, for example, at a specific location on the array, to a melting temperature (e.g. 95° C.). Alternatively, the duplex may be dissociated by addition of an enzyme capable of separating the double-stranded nucleic acids. A helicase enzyme may be added at a specific location on the array. Helicase enzymes are known in the art and have been shown to unwind DNA from a double-stranded structure to a single-stranded structure. The single-stranded extension product can be transferred to a second support 15 comprising a first plurality of anchor support-bound oligonucleotides, the first plurality of anchor oligonucleotides comprising a sequence partially complementary to a first extension product (e.g. construction oligonucleotide 1). The first extension product is allowed to hybridize under appropriate conditions to the first plurality of anchor oligonucleotides. During the same reaction or subsequently, the other extension products (or overlapping construction oligonucleotides) are allowed to hybridize under the appropriate conditions to their complementary sequences, thereby forming a longer polynucleotide sequence.

Referring to FIG. 1d, construction oligonucleotides 65 can be transferred to a new surface 15 comprising an anchor support-bound oligonucleotide 40 having a sequence complementary to the first construction oligonucleotide (construction oligonucleotide 1). Additional construction oligonucleotides (construction oligonucleotides 2, 3, 4 etc.) are designed to hybridize to each other through their overlapping regions, as shown, to form a longer nucleic acid construct 70. In some embodiments, the anchor support-bound oligonucleotide is preferably single-stranded. In some embodiments, the anchor support-bound oligonucleotide comprises a 5′ terminus complementary to the 5′ terminus of a first plurality of oligonucleotides. The additional construction oligonucleotides that together form the polynucleotide sequence comprise complementary 3′ termini and hybridize to each other. The inset of FIG. 1d shows an example of oligonucleotides that have been designed to hybridize to each other to assemble into a longer polynucleotide construct built onto anchor oligonucleotide 40. Ligase 80 is introduced into solution to ligate each junction, thus forming a covalently joined longer polynucleotide construct 90 as shown in FIG. 1e. If desired, the last oligonucleotide in the assembly (e.g. construction oligonucleotide 4) may be labeled with a fluorescent label so as to indicate that a full length construction has taken place.

In certain exemplary embodiments, a detectable label can be used to detect one or more oligonucleotides or polynucleotides described herein. Examples of detectable markers include various radioactive moieties, enzymes, prosthetic groups, fluorescent markers, luminescent markers, bioluminescent markers, metal particles, protein-protein binding pairs, protein-antibody binding pairs and the like. Examples of fluorescent markers include, but are not limited to, yellow fluorescent protein (YFP), green fluorescent protein (GFP), cyan fluorescent protein (CFP), umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride, phycoerythrin and the like. Examples of bioluminescent markers include, but are not limited to, luciferase (e.g., bacterial, firefly, click beetle and the like), luciferin, aequorin and the like. Examples of enzyme systems having visually detectable signals include, but are not limited to, galactosidases, glucuronidases, phosphatases, peroxidases, cholinesterases and the like. Identifiable markers also include radioactive compounds such as 125I, 35S, 14C, or 3H. Identifiable markers are commercially available from a variety of sources.

In some embodiments, the support comprises a set of features comprising support-bound oligonucleotides complementary to construction oligonucleotides and at least one feature comprising an anchor support-bound oligonucleotide. The anchor oligonucleotide is preferably single-stranded and comprises a sequence complementary to a terminus sequence of the target polynucleotide. Referring to FIG. 2a, an oligonucleotide array 110 is shown which is similar to that described in FIG. 1a, except that the support comprises, on the same surface, support-bound oligonucleotides 130 and support-bound anchor oligonucleotides 140.

Referring to FIG. 2b, a primer 150 is hybridized to the support-bound oligonucleotides 130 under hybridizing conditions. In a first step, the primer is annealed to a support-bound single-stranded oligonucleotide and extended with the addition of dNTPs and an appropriate polymerase under appropriate conditions and temperature, to form construction oligonucleotides 160 (construction oligonucleotides 1, 2, 3, 4) which are complementary to the support-bound oligonucleotides (1′, 2′, 3′, 4′). In some embodiments, the primer is a primer containing multiple uracil (U). In a subsequent step, the primer is removed. Preferably, a USER™ endonuclease is added to digest the primer. Digestion of the primers may take place subsequent to the extension step, thereby generating a duplex comprising the construction oligonucleotides hybridized to the support-bound oligonucleotides. Yet in other embodiments, digestion of the primers occurs in solution after release of the construction oligonucleotides in solution (165, FIG. 2c).

In a second step (FIG. 2c), the newly synthesized extension products (construction oligonucleotides 1, 2, 3, and/or 4) are melted and released from the features (construction oligonucleotides 165). Dissociation of the construction oligonucleotides can be performed in parallel or sequentially. The construction oligonucleotides 165 may be released in solution. Melting of the duplexes may be performed by increasing the temperature to a melting temperature (e.g. 95° C.). Alternatively, the duplexes may be dissociated using an enzyme capable of dissociating the double-stranded nucleic acids, such as a helicase. The first extension product is allowed to hybridize under appropriate conditions to the first plurality of anchor oligonucleotides and the other extension products (or overlapping oligonucleotides) are allowed to hybridize under appropriate conditions to their complementary sequences.

Referring to FIG. 2d, construction oligonucleotides (165) are hybridized to an anchor support-bound oligonucleotide 140 having a sequence complementary to the first construction oligonucleotide (construction oligonucleotide 1). Additional construction oligonucleotides (2, 3, 4 etc.) are designed to hybridize to each other through their overlapping regions, as shown in FIG. 2d, to form a longer nucleic acid construct 170. In preferred embodiments, the anchor support-bound oligonucleotide is single-stranded. In some embodiments, the anchor support-bound oligonucleotide comprises a 5′ terminus complementary to the 5′ terminus of a first plurality of oligonucleotides. The additional construction oligonucleotides that together form the polynucleotide sequence comprise complementary 3′ termini and hybridize to each other. FIG. 2d shows an example of oligonucleotides that have been designed to hybridize to each other to assemble into a longer polynucleotide construct built onto anchor oligonucleotide 140. Ligase 180 is introduced into solution to ligate each junction, thus forming a covalently joined longer polynucleotide construct 190 as shown in FIG. 2e. If desired, the last oligonucleotide in the assembly (construction oligonucleotide 4) may be labeled with a fluorescent label so as to indicate that a full length construction has taken place.

Aspects of the invention relate to the selection of target polynucleotides having the predefined sequence and/or to the removal of undesired assembly products. During polynucleotide assembly from oligonucleotides, undesired products such as partial length polynucleotides or truncated constructs can be assembled. It can be useful to remove undesired products or polynucleotides that do not have the correct sequence and/or length. In some embodiments, one or more assembled polynucleotides may be sequenced to determine whether they contain the predetermined sequence or not. This procedure allows fragments with the correct sequence to be identified. In other embodiments, other techniques known in the art may be used to remove error-containing nucleic acid fragments.

In some aspects of the invention, methods are provided for selectively protecting target polynucleotide sequences from exonuclease digestion, thereby facilitating the elimination of undesired constructs. Any of a variety of nucleases that preferentially digest single-stranded nucleic acids can be used. Suitable nucleases include, for example, a single-strand specific 3′ exonuclease, a single-strand specific endonuclease, a single-strand specific 5′ exonuclease, and the like. In certain embodiments, the nuclease comprises E. coli Exonuclease I. In some embodiments, the exonuclease digestion is performed to digest all non-double-stranded sequences. Selection methods are illustrated in FIGS. 3a-3e. In one embodiment, the selection method takes advantage of the terminal 3′ or 5′ overhang of the fully assembled product. One will appreciate that the single-stranded overhanging sequence (5′ or 3′) of the fully assembled product will correspond to the sequence of the terminal oligonucleotide (for example construction oligonucleotide 4, as depicted in FIG. 3b). In undesired products, the single-stranded 3′ or 5′ overhang sequence will have a sequence different from the predefined terminal oligonucleotide. For example, as depicted in FIGS. 3a-3e, the undesired products have a free overhang having construction oligonucleotide 2 or 3 instead of terminal construction oligonucleotide 4. As used herein, the term “terminal oligonucleotide” or “terminal construction oligonucleotide” refers to the oligonucleotide at the target polynucleotide terminal sequence or terminal overhang. In some embodiments, the terminal oligonucleotide corresponds to the 3′ or the 5′ single-stranded overhang of the target polynucleotide. In some embodiments, the target polynucleotide sequence comprises at least a first oligonucleotide and a terminal construction oligonucleotide, wherein the terminal oligonucleotide is downstream from the first oligonucleotide.
In some embodiments, the target polynucleotide comprises a first oligonucleotide, a terminal oligonucleotide and at least one internal oligonucleotide.

FIGS. 3a-3c illustrate the case wherein truncated or undesired assemblies (72 and 74, FIG. 3b) are generated as well as full length assemblies (70, FIG. 3b). In order to filter out the undesired assemblies, a nucleic acid hairpin structure or stem-loop oligonucleotide 200 can be added to the assembly product. The stem-loop oligonucleotide is designed to hybridize to the 5′ or the 3′ overhang and ligate to the two terminal oligonucleotides present in the full length assembly product 70. In addition, the stem-loop oligonucleotide is designed not to hybridize or ligate to truncation products 72 and 74.

The stem-loop structure may be formed by designing the oligonucleotide to have complementary sequences within its single-stranded sequence, whereby the single strand folds back upon itself to form a double-stranded stem and a single-stranded loop. Preferably, the double-stranded stem domain has at least about 2 base pairs and the single-stranded loop has at least 3 nucleotides. Preferably, the stem comprises an overhanging single-stranded region (3′ or 5′), i.e., the stem is a partial duplex. For example, the overhang length can be from about 3 to about 10, to about 20, to about 50 nucleotides, etc. In an exemplary embodiment, the overhang of the stem-loop oligonucleotide is complementary to the 5′ or 3′ single-stranded overhang of the fully assembled polynucleotide or target polynucleotide.
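For illustration only, a stem-loop oligonucleotide of the kind described above can be composed in software from an overhang, a stem and a loop, with the second stem arm being the reverse complement of the first so that the strand can fold back on itself. The function names, the choice of a 5′ overhang and the example sequences are assumptions made for this sketch; the minimum stem and loop lengths follow the preferred values stated above.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def overhang_for(target_overhang):
    """Overhang (5'->3') that can pair with the target's single-stranded overhang."""
    return reverse_complement(target_overhang)


def build_stem_loop(overhang, stem, loop):
    """Return a 5'->3' oligonucleotide: overhang + stem + loop + second stem arm."""
    if len(stem) < 2:
        raise ValueError("stem needs at least 2 base pairs")
    if len(loop) < 3:
        raise ValueError("loop needs at least 3 nucleotides")
    return overhang + stem + loop + reverse_complement(stem)


if __name__ == "__main__":
    # Hypothetical 6-nucleotide overhang of the full length product.
    hairpin = build_stem_loop(overhang_for("GTACGT"), "GCAT", "TTTT")
    print(hairpin)
```

A truncated product presents a different overhang sequence, so an oligonucleotide built this way for the full length overhang would not be expected to pair with it; thermodynamic stability of the stem is not evaluated in this sketch.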

Referring to FIG. 3d, the stem-loop oligonucleotide is ligated to the full length polynucleotide having the predefined sequence. Referring to FIG. 3e, the support surface is exposed to an exonuclease such as a 3′ nuclease. In preferred embodiments, the stem-loop oligonucleotide serves to protect the overhang (3′ or 5′ overhang) of the full length polynucleotide construct 70. The undesired constructs (e.g. truncated constructs) which did not hybridize/ligate to the stem-loop oligonucleotides are susceptible to digestion. After the digestion step (FIG. 3e), the stem-loop oligonucleotide may be cleaved off the full length construct. For example, in some embodiments, the stem-loop oligonucleotide is designed to comprise a type II restriction site in the stem structure of the stem-loop oligonucleotide, and the stem-loop oligonucleotide is cleaved off the nucleic acid construct using a restriction enzyme (e.g. a type II restriction enzyme). In other embodiments, the stem-loop oligonucleotide is designed to comprise at least one uracil, and the stem-loop oligonucleotide is cleaved off the nucleic acid construct using a mixture of Uracil DNA glycosylase (UDG) and a DNA glycosylase-lyase Endonuclease VIII or a USER™ enzyme. In some embodiments, the full length polynucleotide may be cleaved off the surface. In some embodiments, necessary restriction sites can be specifically included in the design of the first plurality of oligonucleotides and/or in the design of the anchor oligonucleotides. In some embodiments, the restriction site is a type II restriction site. In some embodiments, the full length construct may be subsequently amplified.

In some embodiments, the 3′ region of the anchor oligonucleotide comprises a restriction enzyme site. In some embodiments, primers/primer binding sites may be designed to include a restriction endonuclease cleavage site. In an exemplary embodiment, a primer/primer binding site contains a binding and/or cleavage site for a type IIs restriction endonuclease. A wide variety of restriction endonucleases having specific binding and/or cleavage sites are commercially available, for example, from New England Biolabs (Beverly, Mass.). In various embodiments, restriction endonucleases that produce 3′ overhangs, 5′ overhangs or blunt ends may be used. When using a restriction endonuclease that produces an overhang, an exonuclease (e.g., RecJ_(f), Exonuclease I, Exonuclease T, S₁ nuclease, P₁ nuclease, mung bean nuclease, T4 DNA polymerase, CEL I nuclease, etc.) may be used to produce blunt ends. Alternatively, the sticky ends formed by the specific restriction endonuclease may be used to facilitate assembly of subassemblies in a desired arrangement. In an exemplary embodiment, a primer/primer binding site that contains a binding and/or cleavage site for a type IIs restriction endonuclease may be used to remove the temporary primer. The term “type-IIs restriction endonuclease” refers to a restriction endonuclease having a non-palindromic recognition sequence and a cleavage site that occurs outside of the recognition site (e.g., from 0 to about 20 nucleotides distal to the recognition site). Type IIs restriction endonucleases may create a nick in a double-stranded nucleic acid molecule or may create a double-stranded break that produces either blunt or sticky ends (e.g., either 5′ or 3′ overhangs).
Examples of Type IIs endonucleases include, for example, enzymes that produce a 3′ overhang, such as, for example, Bsr I, Bsm I, BstF5 I, BsrD I, Bts I, Mnl I, BciV I, Hph I, Mbo II, Eci I, Acu I, Bpm I, Mme I, BsaX I, Bcg I, Bae I, Bfi I, TspDT I, TspGW I, Taq II, Eco57 I, Eco57M I, Gsu I, Ppi I, and Psr I; enzymes that produce a 5′ overhang such as, for example, BsmA I, Ple I, Fau I, Sap I, BspM I, SfaN I, Hga I, Bvb I, Fok I, BceA I, BsmF I, Ksp632 I, Eco31 I, Esp3 I, Aar I; and enzymes that produce a blunt end, such as, for example, Mly I and Btr I. Type-IIs endonucleases are commercially available and are well known in the art (New England Biolabs, Beverly, Mass.).

In other embodiments, the primer and/or primer binding site comprises at least one uracil and the primer is cleaved off using a mixture of Uracil DNA glycosylase (UDG) and a DNA glycosylase-lyase Endonuclease VIII or a USER™ enzyme as provided herein.

In some other embodiments, selection of target polynucleotides takes advantage of the size or length of the desired target polynucleotide. FIGS. 4a-4c illustrate methods for measuring the length of a polynucleotide construct(s) on an array's surface and for selecting full length polynucleotide construct(s). Preferably, the methods allow for the selection of the correct length target polynucleotide construct from amongst a distribution of different polynucleotide construct lengths. One skilled in the art would appreciate that attachment chemistries for nucleic acids to glass surfaces typically result in a nucleic acid molecule-to-molecule spacing (d) ranging from 1 to 15 nm, preferably from 2 to 8 nm, preferably from 5 to 7 nm. In some embodiments, distance d is about 6 nm. Referring to FIG. 4b, a surface 215 may be prepared which has a first plurality of anchor oligonucleotides 240 and a second plurality of anchor oligonucleotides 242 immobilized on the support, wherein the first and second pluralities of anchor oligonucleotides have a different predefined sequence. In some embodiments, the first and second pluralities of anchor oligonucleotides are separated by a predetermined distance X. In some embodiments, the distance X may be set and controlled by mixing equal numbers of first and second anchor molecules with a third support-bound oligonucleotide sequence referred to as a spacer oligonucleotide sequence 245. In some embodiments, the spacer oligonucleotide sequence is designed not to have complementary sequences with the construction oligonucleotides. In some embodiments, the spacer oligonucleotide is single-stranded. Yet, in other embodiments, the spacer oligonucleotide is double-stranded. In some embodiments, the distance X is set using the following equation:

X ~ d × (C[spacer] / C[anchor1+anchor2])

wherein d is the distance between two nucleic acid molecules, C[spacer] is the concentration of the spacer oligonucleotide, and C[anchor1+anchor2] is the concentration of a mixture of anchor oligonucleotide 1 and anchor oligonucleotide 2.
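As a worked illustration of the above relation, the following sketch evaluates X for a given spacing d and concentration ratio, and inverts the relation to estimate the spacer-to-anchor ratio for a desired X. The function names and the numerical values in the example are assumptions chosen for illustration; the relation itself is approximate, as indicated above.

```python
def anchor_distance_nm(d_nm, c_spacer, c_anchors):
    """X ~ d * (C[spacer] / C[anchor1+anchor2]); concentrations share one unit."""
    if c_anchors <= 0:
        raise ValueError("anchor concentration must be positive")
    return d_nm * (c_spacer / c_anchors)


def spacer_to_anchor_ratio(x_nm, d_nm):
    """Concentration ratio C[spacer] / C[anchor1+anchor2] giving distance X."""
    if d_nm <= 0:
        raise ValueError("molecule-to-molecule spacing must be positive")
    return x_nm / d_nm


if __name__ == "__main__":
    # d of about 6 nm, with a hypothetical 50:1 spacer-to-anchor ratio.
    print(anchor_distance_nm(6.0, 50.0, 1.0))   # 300.0
    # Ratio for a hypothetical desired distance X of 300 nm.
    print(spacer_to_anchor_ratio(300.0, 6.0))   # 50.0
```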

Referring to FIG. 4a, construction oligonucleotides 265 are synthesized by primer extension using support-bound oligonucleotides 230 as templates. In some embodiments, a first support-bound anchor oligonucleotide 240 is designed to have a sequence complementary to a first plurality of construction oligonucleotides. Construction oligonucleotides 265 may hybridize to each other and hybridize to the anchor oligonucleotide, thereby generating support-bound polynucleotide constructs (270, 272, 273, 274, FIG. 4b). After assembly reactions, some of these polynucleotide constructs (270) may be full length polynucleotides having the predefined sequence, whereas other polynucleotide constructs (272, 273, 274) may be shorter than full length polynucleotide constructs. Taking into account that the nucleotides in a nucleic acid construct are spaced about 0.33 nm or 0.34 nm apart, a nucleic acid construct comprising 1000 nucleotide bases (as per a typical gene length) will be about 340 nm in length. In some embodiments, a second support-bound anchor oligonucleotide 242 may be designed so that it can connect to the terminus or 5′ overhang of the full length DNA construct 270, resulting in bound full length construct 290. Furthermore, if the distance X is set to be approximately the length expected of the full length construct, then the terminus or 5′ overhang of the full length construct should only bind to the second anchor oligonucleotide if a) the full length construct has the correct sequence at the end, and b) the full length construct has the correct length (290, as shown in FIG. 4c). In some embodiments, the first anchor 240 may comprise a type II endonuclease site. The full length product may be cleaved using a type II restriction endonuclease, resulting in a product that is anchored at the distal end to the second anchor 242.
In some embodiments, the anchor sequence comprises at least one uracil and the full length product is cleaved using a mixture of Uracil DNA glycosylase (UDG) and a DNA glycosylase-lyase Endonuclease VIII or a USER™ enzyme.
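The length estimate used above (about 0.34 nm per nucleotide, hence about 340 nm for 1000 nucleotides) can be reproduced with a short calculation, together with a simple check of whether a construct of a given length would span a distance X. The function names and the tolerance parameter are assumptions introduced for this sketch; no tolerance value is specified herein.

```python
RISE_NM_PER_NT = 0.34  # approximate spacing between nucleotides, as stated above


def construct_length_nm(n_nucleotides, rise_nm=RISE_NM_PER_NT):
    """Approximate end-to-end length of a fully extended construct, in nm."""
    return n_nucleotides * rise_nm


def can_reach_second_anchor(n_nucleotides, x_nm, tolerance_nm):
    """True if the construct length is within a hypothetical tolerance of X."""
    return abs(construct_length_nm(n_nucleotides) - x_nm) <= tolerance_nm


if __name__ == "__main__":
    print(round(construct_length_nm(1000)))            # 340
    print(can_reach_second_anchor(1000, 340.0, 20.0))  # full length: True
    print(can_reach_second_anchor(500, 340.0, 20.0))   # truncated: False
```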

Some aspects of the invention relate to devices and methods enabling highly parallel support-bound oligonucleotide assembly. In some embodiments, an array of polynucleotides may be assembled on a surface by sequential addition of complementary overlapping oligonucleotides to a plurality of anchor oligonucleotides. In preferred embodiments, a plurality of polynucleotides having different pre-determined sequences are synthesized at different features of an array. Referring to FIG. 5, an anchor array 310 is provided wherein each feature of the array comprises a support-bound anchor oligonucleotide (340: A₀, B₀, C₀, D₀) as described above. Each anchor oligonucleotide may be single-stranded and may comprise at its 5′ terminus a sequence complementary to the 5′ terminus of a first plurality of oligonucleotides. One should appreciate that, for highly parallel synthesis of different polynucleotides, each plurality of anchor oligonucleotides at each feature needs to have a sequence complementary to the 5′ end of the predetermined polynucleotide sequence to be synthesized. Accordingly, in some embodiments, the anchor array comprises different populations of anchor oligonucleotides at different features of the array, each population or plurality of anchor oligonucleotides having a different 5′ terminal sequence. Additionally, two or more construction arrays (311, 312, and 315) can be provided, wherein the construction arrays comprise a plurality of features, each feature comprising a different population of support-bound oligonucleotides (A_(n), B_(n), C_(n)) having a predefined sequence. It should be appreciated that, in some embodiments, each feature comprises a plurality of support-bound oligonucleotides having a pre-determined sequence different from the sequence of the plurality of support-bound oligonucleotides from another feature on the same surface. The oligonucleotide sequences can differ by one or more bases. Referring to FIG. 5, the first array 311 comprises a first plurality of support-bound oligonucleotides 331 (A₁′, B₁′, C₁′, D₁′), wherein part of the sequence of each of the first plurality of oligonucleotides is identical to the 5′ end of the anchor oligonucleotides 314 attached to a feature of the anchor array. The second array 312 comprises a second plurality of support-bound oligonucleotides 332 (A₂′, B₂′, C₂′, D₂′), wherein the 5′ end of each of the second plurality of oligonucleotides is complementary to the 5′ end of a first plurality of oligonucleotides 331. One should appreciate that depending on the polynucleotides' sequence and the length to be assembled, one or more (for example, m) additional arrays may be provided, each array comprising a plurality of support-bound oligonucleotides (A′_(m-1), B′_(m-1), C′_(m-1), D′_(m-1)) having a free 5′ end complementary to the 5′ end of another plurality of support-bound oligonucleotides 335 (A′_(m), B′_(m), C′_(m), D′_(m)). In some embodiments, each plurality of complementary oligonucleotides is provided on a different support.

Referring to FIG. 6, a first plurality of complementary oligonucleotides (construction oligonucleotides A₁, B₁, C₁, D₁) are generated using the first plurality of support-bound oligonucleotides 331 (A₁′, B₁′, C₁′, D₁′) as templates. In some embodiments, one or more support-bound oligonucleotides are incubated with a primer in the presence of a polymerase under conditions promoting primer extension. In some embodiments, the first plurality of support-bound oligonucleotides 331 (A₁′, B₁′, C₁′, D₁′) has at its 3′ end a sequence designed to be complementary to the primer sequence (e.g. primer binding site). In some embodiments, the primer is a primer containing multiple uracil and, after extension, the primer is removed using a USER™ endonuclease as described herein. Similarly, construction oligonucleotides A₂, B₂, C₂, D₂ and A_(m), B_(m), C_(m), D_(m) can be synthesized in a parallel or sequential fashion, thereby generating support-bound double-stranded oligonucleotides (e.g. 361: A₁A₁′, etc.).

As shown in FIGS. 7a-7b, construction oligonucleotides can be released from the construction array. In some embodiments, the construction oligonucleotides are released under conditions promoting dissociation of the duplexes (e.g. under melting temperatures or in the presence of a helicase as provided herein). FIGS. 7a-7b illustrate parallel synthesis of a plurality of polynucleotides having different pre-determined sequences by sequential addition of complementary overlapping construction oligonucleotides. Referring to FIG. 7a, a first set of different construction oligonucleotides 371 (A₁, B₁, C₁, D₁) are released from construction array 311. In some embodiments, the construction oligonucleotides are transferred and annealed to an anchor array 310 comprising anchor oligonucleotides (A₀, B₀, C₀, D₀) having at their 5′ end a sequence complementary to the sequence at the 5′ end of the first set of construction oligonucleotides 371, thereby forming duplexes 381 (e.g. A₀A₁, B₀B₁, C₀C₁, D₀D₁). In preferred embodiments, the first plurality of duplexes comprises a 3′ free overhang. As illustrated in FIG. 7b, a second population of different construction oligonucleotides 372 (A₂, B₂, C₂, D₂) having at one end a sequence complementary to the overhang of the first plurality of duplexes are released from construction array 312. The second population is annealed to the free 3′ overhang of the anchor-first construction oligonucleotide duplexes 381 attached to the anchor array 310 to form duplexes 382 (A₀A₁A₂, B₀B₁B₂, C₀C₁C₂, D₀D₁D₂). In some embodiments, the second plurality of duplexes comprises a 5′ free overhang. In some embodiments, a third population of construction oligonucleotides designed to have a sequence complementary to the 5′ overhang is annealed to the 5′ free overhang of the second plurality of duplexes.
Such a process can be repeated with additional construction oligonucleotides generated and released from construction arrays until the desired length and sequence of each polynucleotide has been synthesized. In some embodiments, the internal construction oligonucleotides are designed to have a sequence region at their 3′ end complementary to the sequence region at the 3′ end of a next internal construction oligonucleotide. In some embodiments, each population of construction oligonucleotides is synthesized from a different construction array, and the populations are designed to hybridize to each other to assemble into a longer polynucleotide having a predefined sequence. In some embodiments, the construction oligonucleotide corresponding to the 5′ end of the desired polynucleotide has a sequence complementary to a support-bound anchor oligonucleotide. In some embodiments, the construction oligonucleotides are joined using a ligase. In some embodiments, each construction oligonucleotide is annealed to an overhang and the construction oligonucleotides defining one strand of the double-stranded target polynucleotide can be ligated. Construction oligonucleotides may be ligated after each sequential addition of a construction oligonucleotide or may be ligated once the construction oligonucleotides have annealed to each other to form the full length polynucleotide.

In some aspects of the invention, construction oligonucleotides are sequentially transferred from a construction array to an anchor array in a highly parallel fashion. In some embodiments, a plurality of polynucleotides are assembled on an anchor array by sequential alignment, transfer and addition of complementary overlapping oligonucleotides to a plurality of anchor oligonucleotides. In some embodiments, the construction array and the anchor array are brought into close proximity to allow the transfer of construction oligonucleotides from the construction array to the anchor array. Preferably, the construction array is brought to a distance substantially comparable to or smaller than the distance between two sets of oligonucleotides (A₁, B₁, etc.). In some embodiments, the distance between the construction array and the anchor array is from about 10 μm to about 1000 μm. Desired distances within this range are achieved, in some embodiments, by use of a dilution of spacer spheres (for example, available from Cospheric Microspheres) to a monolayer which keeps the two arrays apart under compression. In other embodiments, a silicone membrane, for example polydimethylsiloxane (PDMS), is fabricated to encompass one of the arrays in a thin chamber which seals upon bringing the second array to the height of the membrane. In other embodiments, one array can float on top of the other using the liquid medium itself as a spacer. For example, 100 μl of fluid medium or solution, which has a thickness of approximately 50 microns when spread evenly over a standard microscope slide, can be used.
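The film thickness quoted above can be checked with a short calculation. The following is a minimal sketch, assuming a standard microscope slide of about 75 mm by 25 mm (the slide dimensions are an assumption, as the specification does not state them) and an evenly spread film; the function name is hypothetical.

```python
def film_thickness_um(volume_ul: float,
                      length_mm: float = 75.0,
                      width_mm: float = 25.0) -> float:
    """Thickness, in micrometres, of a liquid film spread evenly over a rectangle.

    The default 75 mm x 25 mm footprint is an assumed slide size.
    """
    if volume_ul < 0 or length_mm <= 0 or width_mm <= 0:
        raise ValueError("volume must be non-negative and dimensions positive")
    area_mm2 = length_mm * width_mm
    # 1 microlitre equals 1 cubic millimetre, so volume / area is in millimetres.
    thickness_mm = volume_ul / area_mm2
    return thickness_mm * 1000.0


if __name__ == "__main__":
    print(f"{film_thickness_um(100.0):.1f} um")
```

Under these assumptions, 100 μl gives a film of roughly 53 μm, consistent with the approximately 50 microns stated above and within the 10 μm to 1000 μm range of array separations.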

In some embodiments, the plurality of construction oligonucleotides are synthesized onto at least one array as described above. Referring to FIG. 8, a plurality of construction oligonucleotides are synthesized at selected features of at least one construction array (e.g. surface 411), each plurality of oligonucleotides (e.g. A₁, B₁, C₁, D₁) having a different predefined sequence. In some embodiments, a plurality of construction oligonucleotides are synthesized at selected features of a plurality of construction supports (e.g. surfaces 411, 412, 415), each plurality of oligonucleotides having a different predefined sequence. According to some embodiments, construction oligonucleotides are synthesized by primer extension using support-bound oligonucleotides as templates. In some embodiments, a first plurality of oligonucleotides (e.g. oligonucleotides 461 on support 411) may be incubated with a primer in the presence of a polymerase, under conditions promoting primer extension. Each of the first plurality of support-bound oligonucleotides can have at its 3′ end a sequence designed to be complementary to a primer sequence (e.g. a primer binding site). In some embodiments, the primer contains multiple uracils (e.g. USER™ cleavable primers) and, after primer extension, the primer is removed using USER™ endonuclease as described herein. Similarly, construction oligonucleotides A₂, B₂, C₂, D₂ and A_(m), B_(m), C_(m), D_(m) having a predefined sequence can be synthesized in parallel or sequential fashion by primer extension, thereby forming duplexes. As illustrated in FIG. 8, this results in a plurality of arrays having on their surfaces (411, 412, 415) a plurality of duplexes comprising the construction oligonucleotides and the template oligonucleotides (oligonucleotides 461: A₁, B₁, C₁, D₁ on surface 411; oligonucleotides 462: A₂, B₂, C₂, D₂ on surface 412; oligonucleotides 465: A_(m), B_(m), C_(m), D_(m) on surface 415).

Each of the plurality of construction arrays may be designed to have the same configuration, each feature being separated from the next feature by the same distance and each feature being similarly arranged on the array. For example, the first support 411 has n features comprising a first, a second and an n^(th) population of oligonucleotides, respectively, each oligonucleotide having a predefined sequence. One would appreciate that each plurality of oligonucleotides can differ from the other plurality of oligonucleotides by one or more bases. Similarly, the support 412 has a first, a second and an n^(th) population of oligonucleotides wherein the first population of oligonucleotides of the first support has a sequence complementary to the first population of oligonucleotides of the second support (as illustrated in FIG. 9c). In an exemplary embodiment, the first population of oligonucleotides of the first support has a 3′ end sequence complementary to the first population of oligonucleotides of the second support. Similarly, the m^(th) population of oligonucleotides of the m^(th) support has a sequence region complementary to the (m-1)^(th) population of oligonucleotides of the (m-1)^(th) support.

In some embodiments, a first construction array is aligned to an anchor array 410 wherein each feature comprises a support-bound anchor oligonucleotide (A₀, B₀, C₀, D₀). Each anchor oligonucleotide can be single-stranded and can comprise at its 5′ end a sequence complementary to the 5′ terminus of a plurality of oligonucleotides of the first construction array. One should appreciate that for the highly parallel synthesis of a plurality of polynucleotides having different predefined sequences, each plurality of anchor oligonucleotides can have a different 5′ end sequence. In some embodiments, the different 5′ end sequences can differ by one or more bases. In some embodiments, the construction array and the anchor array are aligned vertically, the construction array defining a top array and the anchor array defining a bottom array. In preferred embodiments, the anchor array and the construction arrays are designed to have the same configuration, each feature being separated from the next feature by the same distance and each feature being similarly arranged on the array. One should appreciate that the design of the construction and anchor arrays enables the alignment of the anchor and the construction oligonucleotides, the anchor oligonucleotides having a sequence complementary to the construction oligonucleotides. After alignment of the construction and anchor arrays, the construction oligonucleotides may be released in solution, resulting in the capture and hybridization of the construction oligonucleotides to the anchor oligonucleotides. A second population of construction oligonucleotides immobilized onto a different support can then be brought into close proximity to the anchor array and added sequentially to the duplex comprising the anchor oligonucleotide and the first population of construction oligonucleotides.

The first construction array can be aligned and approximated to the anchor array. In some embodiments, the alignment and approximation of the construction array and anchor array is in the presence of a fluid medium or solution which allows for the subsequent proximal diffusion and transfer of construction oligonucleotides from the top construction array to the bottom anchor array (illustrated in a vertical direction in FIGS. 9a-9b). Construction oligonucleotides can be released from the construction array in a fluid medium, for example under conditions promoting dissociation of duplexes. For example, the duplexes 461 can be dissociated by heating selected features or the entire array to a temperature above the melting temperature of the construction oligonucleotide duplexes. Referring to FIGS. 9a-9b, a first set of construction oligonucleotides having different predefined sequences (A₁, B₁, C₁, D₁) is released from the duplexes 461 on the first construction array into a fluid medium 485 and captured onto the anchor array, forming a plurality of duplexes (e.g. duplexes 441: A₀A₁, B₀B₁, C₀C₁, D₀D₁). Referring to FIGS. 9c-9d, a second construction array is aligned and brought into close proximity to the anchor array comprising the anchor-first population construction oligonucleotide duplexes. Alignment and approximation of the second construction array and the anchor array in the presence of a fluid medium allows for the subsequent proximal transfer of a second set of construction oligonucleotides from the second construction array to the anchor array. Referring to FIG. 9d, the second set of construction oligonucleotides 462 (A₂, B₂, C₂, D₂) is released from the microarray 412 and annealed to anchor microarray 410 to form polynucleotides 442 (A₀A₁A₂, B₀B₁B₂, C₀C₁C₂, D₀D₁D₂). The resulting assembled polynucleotides may then be ligated by including a ligase, for example Taq DNA ligase and its necessary reaction components, in the fluid medium to form single covalently linked molecules.
In some embodiments, the ligase may be supplemented with a non-strand displacing DNA polymerase to fill in gaps and increase the efficiency of ligation. Such a process may be repeated with additional members of the construction microarray ensemble until polynucleotides of desired length have been synthesized on the anchor oligonucleotide array.

In some embodiments, error correction may be included between each process repetition and/or at the end of the assembly process to increase the relative population of synthesized polynucleotides without deviation from the desired sequences. Such error correction may include direct sequencing and/or the application of error correcting enzymes such as error correcting nucleases (e.g. CEL I), error correction based on binding of MutS, MutS homologs or other mismatch binding proteins, other means of error correction as known in the art, or any combination thereof. In an exemplary embodiment, CEL I may be added to the oligonucleotide duplexes in the fluid medium. CEL I is a mismatch-specific endonuclease that cleaves all types of mismatches such as single nucleotide polymorphisms, small insertions or deletions. Addition of the CEL I endonuclease results in the cleavage of the double-stranded oligonucleotides at the site or region of the mismatch. FIG. 9e depicts an anchor array 410 having on its surface a plurality of polynucleotides 452 which have been assembled by means of the process described in FIGS. 9a-9d. The assembled polynucleotides may contain one or more sequence errors 500 (illustrated by a cross). An error correcting nuclease such as CEL I may be used to cleave the double-stranded polynucleotide at such error sites, resulting in cleaved polynucleotides 453 as shown in FIG. 9f.

In some embodiments, the alignment and approximation of the first construction array to the anchor array is in the presence of a fluid medium and a porous membrane. According to some embodiments, a porous membrane is placed between the construction array and the anchor array to limit the lateral diffusion of the construction oligonucleotides in the fluid medium towards non-selected features of the anchor array (FIGS. 10a-10b). One should appreciate that the permeable membrane can constrain diffusion of construction oligonucleotides primarily to the vertical direction, thus decreasing lateral diffusion of construction oligonucleotides towards non-corresponding anchor oligonucleotides. For example, the membrane can be a porous polymer membrane with pores of uniform size. In some embodiments, the membrane has pore sizes sufficient for the relatively free passage of nucleic acids. In some embodiments, the pore size can range from about 10 nm to about 100 nm or more. Preferably, the pore size is not greater than about the distance between different oligonucleotides A₁, B₁, etc. In a preferred embodiment, the pore fill factor or aperture ratio is as large as possible. Suitable membranes include those described in: Polymer Membranes with Two-Dimensionally Arranged Pores Derived from Monolayers of Silica Particles, Feng Yan and Werner A. Goedel, Chem. Mater., 2004, 16 (9), pp 1622-1626.

One should appreciate that, for certain situations, it is beneficial to ensure that a high percentage of the available sites on the anchor array capture the construction oligonucleotides, instead of being left unfilled due to recapture of the construction oligonucleotides by the construction array. In order to increase the probability of capture by the anchor array, covalent bonding of at least some of the construction oligonucleotides to the anchor array may be carried out. Referring to FIGS. 11a-11b, a modification of the process detailed in FIGS. 9a-9f is depicted. FIGS. 11a-11b illustrate the alignment and approximation of a first construction array comprising at least two different overlapping construction oligonucleotides to an anchor array in a fluid medium or solution comprising a ligase and the necessary reaction components, and the subsequent proximal transfer of the construction oligonucleotides from the construction array to the anchor array. In some embodiments, each feature on a construction oligonucleotide array is designed to contribute two or more overlapping oligonucleotides to be captured by a population of anchor oligonucleotides on a selected feature of the anchor array. Referring to FIG. 11a, each feature of the construction oligonucleotide array 420 carries two construction oligonucleotides 481 (e.g. A₁ and A₂) per anchor oligonucleotide (e.g. A₀). The construction oligonucleotides A₁ and A₂ may be released into the fluid medium or solution 485 residing between construction array 420 and anchor array 410. In some embodiments, the solution further comprises a ligase 180 such that when construction oligonucleotides A₁ and A₂ assemble onto anchor A₀ (FIG. 11b), oligonucleotide A₂ is covalently ligated to anchor A₀. This arrangement and the presence of ligase provide for the preferred capture of construction oligonucleotides by the anchor oligonucleotide on the anchor array.

In some embodiments, the construction oligonucleotide array(s) and the anchor oligonucleotide array are designed such that the number of construction oligonucleotides to be transferred to the anchor array is in stoichiometric excess to each corresponding anchor oligonucleotide. This design allows for a substantially higher probability that the construction oligonucleotides will be captured by each of the anchor oligonucleotides, thereby increasing the stepwise yield in the synthesis of the predefined polynucleotides.

FIGS. 12a-12b depict the alignment and approximation of a first construction array to an anchor array in a fluid medium and the subsequent proximal transfer of construction oligonucleotides from the construction array to the anchor array, in which the number of construction oligonucleotides is in stoichiometric excess to each corresponding anchor oligonucleotide. Referring to FIG. 12a, the anchor array 430 is designed such that it comprises stoichiometrically fewer anchor oligonucleotides (A₀, B₀, C₀, D₀) as compared to the number of construction oligonucleotides provided by construction array 411 for each corresponding anchor oligonucleotide. This design ensures that the binding of the construction oligonucleotides to each anchor oligonucleotide on the anchor array is stoichiometrically favored. Referring to FIG. 12a and for illustrative purposes, three construction oligonucleotides are depicted at each feature of the construction array 411 for each anchor oligonucleotide on anchor array 430. The construction oligonucleotide array 411 is aligned in relation to the anchor oligonucleotide array 430. Focusing on the alignment of construction oligonucleotide A₁ with anchor oligonucleotide A₀, after dissociation of construction oligonucleotide A₁ from its corresponding template oligonucleotide, each of the three copies of A₁ has four potential binding sites: each copy of A₁ can bind back or be recaptured by the construction array 411 (3 potential sites) or bind to the anchor oligonucleotide A₀. After capture, one of the four potential binding sites will remain empty. Since it is equally likely for each of the binding sites to remain empty, the probability that A₀ remains empty is 25%, and the probability that A₀ is occupied is 75%.
In order to increase the binding probability of construction oligonucleotides to the anchor oligonucleotides even further, the ratio of construction oligonucleotides to anchor oligonucleotides may be skewed further. In some embodiments, the ratio of construction oligonucleotides to anchor oligonucleotides is at least 10:1, at least 100:1, at least 1000:1, at least 10⁴:1, at least 10⁵:1, or at least 10⁶:1.
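The occupancy argument above generalizes directly to other ratios. The following is a minimal sketch of that reasoning, assuming, as in the illustration, that n released copies compete for n template sites plus one anchor site and that each site is equally likely to be the one left empty; the function name is hypothetical.

```python
from fractions import Fraction


def anchor_occupancy(copies_per_anchor: int) -> Fraction:
    """Probability that the anchor site is occupied after capture.

    Assumes n copies and n + 1 equally likely binding sites (n template
    sites on the construction array plus one anchor site), so exactly one
    site stays empty and the anchor is that site with probability 1/(n + 1).
    """
    if copies_per_anchor < 1:
        raise ValueError("at least one construction oligonucleotide is required")
    sites = copies_per_anchor + 1
    return Fraction(copies_per_anchor, sites)


if __name__ == "__main__":
    for ratio in (3, 10, 100, 1000):
        p = anchor_occupancy(ratio)
        print(f"{ratio}:1 -> {float(p):.4f}")
```

With a 3:1 ratio this reproduces the 75% occupancy of the illustration; under the same equal-likelihood assumption, a 10:1 ratio gives 10/11 and the occupancy approaches 1 as the ratio grows.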

One should appreciate that a method for increasing the efficiency of construction of desired polynucleotides is to reduce the number of steps in the construction process. In some embodiments, the polynucleotides are synthesized using a hierarchical construction method, where multiple anchor arrays, after several rounds of transfer from construction arrays, may be used themselves as construction arrays in following steps. A hierarchical process geometrically reduces the time to execute the same number of transfers, as well as the number of transfers done on each anchor array, accordingly reducing the impact of stepwise loss. FIG. 13a illustrates two anchor arrays 470 and 471 which have undergone a number of transfers from construction arrays, resulting in surface attached synthesized polynucleotides 472 and 473, respectively. As depicted, one strand of the synthesized polynucleotides has been ligated to the original anchors from said anchor arrays, such that, for example, the length of the synthesized polynucleotide is longer than that of A₀ (if the original anchor array is similar to 410 or 430 from FIGS. 11a-11b or FIGS. 12a-12b, respectively). A release of the polynucleotide strands not ligated to the anchor array results in the transfer of polynucleotides between the arrays as shown in FIG. 13b. The presence of a ligase and the necessary ligation reaction components results in the covalent linkage of the polynucleotides together. One should note that, for illustration purposes, the transfer of polynucleotides is shown from anchor array 471 to anchor array 470, although the transfer will be distributed between both arrays. In order to reduce the overall error rate, the surface immobilized synthesized polynucleotides 472 and 473 may first be exposed to an error correcting nuclease as described in the description of FIGS. 9e-9f.
Since some error correcting nucleases cleave at the junction of double and single-stranded nucleic acid, polynucleotides 472 and 473 can be designed to be fully double-stranded or may be converted to double-stranded by adding additional gap filling oligonucleotides to the polynucleotides 472 and 473 followed by ligation.
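The reduction in sequential transfers offered by a hierarchical scheme can be illustrated with a simple count. The following is a minimal sketch under assumptions not stated in the specification: sub-assemblies are built in parallel on separate anchor arrays, the arrays are then merged pairwise, and each pairwise merge counts as one transfer round. The function names and the example figures are hypothetical.

```python
def linear_rounds(n_oligos: int) -> int:
    """Sequential transfer rounds when all oligonucleotides go to one anchor array."""
    if n_oligos < 1:
        raise ValueError("n_oligos must be at least 1")
    return n_oligos


def hierarchical_rounds(n_oligos: int, n_anchor_arrays: int) -> int:
    """Sequential transfer rounds for the assumed hierarchical scheme.

    The oligonucleotides are split evenly over the anchor arrays, which are
    loaded in parallel, then merged pairwise until one array remains.
    """
    if n_oligos < 1 or n_anchor_arrays < 1:
        raise ValueError("both arguments must be at least 1")
    # Ceiling division: transfers received by each array during parallel loading.
    per_array = -(-n_oligos // n_anchor_arrays)
    # Integer ceil(log2(n)): rounds of pairwise merging needed to reach one array.
    merge_rounds = (n_anchor_arrays - 1).bit_length()
    return per_array + merge_rounds


if __name__ == "__main__":
    n = 32  # hypothetical number of construction oligonucleotides per polynucleotide
    print("linear:", linear_rounds(n))
    for arrays in (2, 4, 8):
        print(f"{arrays} anchor arrays:", hierarchical_rounds(n, arrays))
```

In this toy model, 32 transfers done linearly become 10 sequential rounds with four anchor arrays, and no single array receives more than 10 transfers, which is the sense in which stepwise loss is less compounded.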

It should be appreciated that the description of the assembly reactions in the context of oligonucleotides is not intended to be limiting. For example, other polynucleotides (e.g., single-stranded, double-stranded polynucleotides, restriction fragments, amplification products, naturally occurring polynucleotides, etc.) may be included in an assembly reaction, along with one or more oligonucleotides, in order to generate a polynucleotide of interest.

Aspects of the invention may be useful for a range of applications involving the production and/or use of synthetic nucleic acids. As described herein, the invention provides methods for producing synthetic nucleic acids with increased fidelity and/or for reducing the cost and/or time of synthetic assembly reactions. The resulting assembled nucleic acids may be amplified in vitro (e.g., using PCR, LCR, or any suitable amplification technique), amplified in vivo (e.g., via cloning into a suitable vector), isolated and/or purified.

Aspects of the methods and devices provided herein may comprise removal of error-containing nucleic acid sequences as described herein. Error-free nucleic acid sequences can be enriched by removal of error-containing sequences or error-containing nucleotide(s). The nucleic acid sequences can be construction oligonucleotides, or assembled products, such as subassemblies or final desired polynucleotides. In some embodiments, the nucleic acid sequences may be released in solution from the support using methods known in the art, such as, for example, enzymatic cleavage or amplification. This step can take place in localized individual microvolume(s) containing only the region(s) or feature(s) of interest or in a single volume. In some embodiments, removal of the error-containing nucleic acid sequences is performed within a microdroplet. The nucleic acid sequences may be any double-stranded polynucleotide having a predefined sequence. Amplification may be carried out at one or more stages during an assembly process, resulting in a pool of double-stranded oligonucleotides or assembled products. One would appreciate that such a pool may comprise heteroduplexes (double-stranded nucleic acid sequences having one or more sequence errors) and homoduplexes (error-free double-stranded nucleic acid sequences or double-stranded nucleic acid sequences having complementary sequence errors). As illustrated in FIG. 14a, the double-stranded nucleic acids may contain one or more sequence errors (heteroduplexes illustrated by a cross). CEL nucleases (CEL I or CEL II) are mismatch-specific endonucleases known to cut double-stranded nucleic acid in both strands at sites of single-base substitution, small deletion or small insertion. CEL I cleaves the nucleic acid on the 3′ side of the mismatch site, generating a single-stranded 3′ overhang of one or more nucleotides.
In some embodiments, the endonuclease CEL I (SURVEYOR®) can be used to cleave the double-stranded nucleic acids at such error sites, resulting in a pool of nucleic acid sequences comprising cleaved error-containing nucleic acids having a 3′ overhang, and error-free homoduplexes as illustrated in FIG. 14c. The mismatched nucleotide(s) can be removed using a T4 polymerase and/or a Klenow polymerase having a 3′-5′ exonuclease activity, thereby generating a substantially error-free pool of double-stranded nucleic acids (homoduplexes as shown in FIG. 14d). In some embodiments, the pool of nucleic acid sequences can then be amplified, such as by polymerase chain reaction (PCR), using end primers (FIG. 14e). Primers may be universal primers, semi-universal primers or primers specific to the terminal sequence of the nucleic acid molecule. In some embodiments, duplexes are first allowed to dissociate and re-anneal before subjecting the pool of nucleic acids to amplification. This process will allow for detection and removal of complementary errors that may have remained undetected by the mismatch-specific endonucleases.
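The distinction between heteroduplexes and homoduplexes, and the reason re-annealing exposes complementary errors, can be illustrated in silico. The following is a minimal sketch that only compares strand sequences and does not model any enzyme; it assumes equal-length strands written 5′ to 3′ and substitution errors only. The sequences and function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def mismatch_positions(top: str, bottom: str) -> list[int]:
    """0-based positions along `top` where `bottom` does not base-pair.

    Both strands are written 5' to 3' and must have the same length.
    """
    if len(top) != len(bottom):
        raise ValueError("strands must have the same length")
    partner = reverse_complement(bottom)
    return [i for i, (a, b) in enumerate(zip(top, partner)) if a != b]


def classify(top: str, bottom: str) -> str:
    """Label a duplex by whether it contains any mismatch."""
    return "heteroduplex" if mismatch_positions(top, bottom) else "homoduplex"


if __name__ == "__main__":
    designed = "ATGCCGTTA"   # hypothetical designed top strand
    erroneous = "ATGACGTTA"  # same strand with a C-to-A substitution at index 3

    # An error copied onto both strands is invisible to mismatch detection.
    print(classify(erroneous, reverse_complement(erroneous)))
    # After dissociation and re-annealing with a correct partner it is exposed.
    print(classify(erroneous, reverse_complement(designed)),
          mismatch_positions(erroneous, reverse_complement(designed)))
```

The first duplex is reported as a homoduplex even though it deviates from the designed sequence, while the re-annealed duplex is reported as a heteroduplex with a mismatch at index 3, which is the situation the dissociation and re-annealing step is intended to create.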

An assembled nucleic acid (alone or cloned into a vector) may be transformed into a host cell (e.g., a prokaryotic, eukaryotic, insect, mammalian, or other host cell). In some embodiments, the host cell may be used to propagate the nucleic acid. In certain embodiments, the nucleic acid may be integrated into the genome of the host cell. In some embodiments, the nucleic acid may replace a corresponding nucleic acid region on the genome of the cell (e.g., via homologous recombination). Accordingly, nucleic acids may be used to produce recombinant organisms. In some embodiments, a target nucleic acid may be an entire genome or large fragments of a genome that are used to replace all or part of the genome of a host organism. Recombinant organisms also may be used for a variety of research, industrial, agricultural, and/or medical applications.

In some embodiments, methods described herein may be used during the assembly of large nucleic acid molecules (for example, larger than 5,000 nucleotides in length, e.g., longer than about 10,000, longer than about 25,000, longer than about 50,000, longer than about 75,000, longer than about 100,000 nucleotides, etc.). In an exemplary embodiment, methods described herein may be used during the assembly of an entire genome (or a large fragment thereof, e.g., about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or more) of an organism (e.g., of a viral, bacterial, yeast, or other prokaryotic or eukaryotic organism), optionally incorporating specific modifications into the sequence at one or more desired locations.

Aspects of the methods and devices provided herein may include automating one or more acts described herein. In some embodiments, one or more steps of an amplification and/or assembly reaction may be automated using one or more automated sample handling devices (e.g., one or more automated liquid or fluid handling devices). Automated devices and procedures may be used to deliver reaction reagents, including one or more of the following: starting nucleic acids, buffers, enzymes (e.g., one or more ligases and/or polymerases), nucleotides, salts, and any other suitable agents such as stabilizing agents. Automated devices and procedures also may be used to control the reaction conditions. For example, an automated thermal cycler may be used to control reaction temperatures and any temperature cycles that may be used. In some embodiments, a scanning laser may be automated to provide one or more reaction temperatures or temperature cycles suitable for incubating polynucleotides. Similarly, subsequent analysis of assembled polynucleotide products may be automated. For example, sequencing may be automated using a sequencing device and automated sequencing protocols. Additional steps (e.g., amplification, cloning, etc.) also may be automated using one or more appropriate devices and related protocols. It should be appreciated that one or more of the devices or device components described herein may be combined in a system (e.g., a robotic system) or in a micro-environment (e.g., a micro-fluidic reaction chamber). Assembly reaction mixtures (e.g., liquid reaction samples) may be transferred from one component of the system to another using automated devices and procedures (e.g., robotic manipulation and/or transfer of samples and/or sample containers, including automated pipetting devices, micro-systems, etc.). The system and any components thereof may be controlled by a control system.

Accordingly, method steps and/or aspects of the devices provided herein may be automated using, for example, a computer system (e.g., a computer controlled system). A computer system on which aspects of the technology provided herein can be implemented may include a computer for any type of processing (e.g., sequence analysis and/or automated device control as described herein). However, it should be appreciated that certain processing steps may be provided by one or more of the automated devices that are part of the assembly system. In some embodiments, a computer system may include two or more computers. For example, one computer may be coupled, via a network, to a second computer. One computer may perform sequence analysis. The second computer may control one or more of the automated synthesis and assembly devices in the system. In other aspects, additional computers may be included in the network to control one or more of the analysis or processing acts. Each computer may include a memory and processor. The computers can take any form, as the aspects of the technology provided herein are not limited to being implemented on any particular computer platform. Similarly, the network can take any form, including a private network or a public network (e.g., the Internet). Display devices can be associated with one or more of the devices and computers. Alternatively, or in addition, a display device may be located at a remote site and connected for displaying the output of an analysis in accordance with the technology provided herein. Connections between the different components of the system may be via wire, optical fiber, wireless transmission, satellite transmission, any other suitable transmission, or any combination of two or more of the above.

Each of the different aspects, embodiments, or acts of the technology provided herein can be independently automated and implemented in any of numerous ways. For example, each aspect, embodiment, or act can be independently implemented using hardware, software or a combination thereof. When implemented in software, the software code can be executed on any suitable processor or collection of processors, whether provided in a single computer or distributed among multiple computers. It should be appreciated that any component or collection of components that perform the functions described above can be generically considered as one or more controllers that control the above-discussed functions. The one or more controllers can be implemented in numerous ways, such as with dedicated hardware, or with general purpose hardware (e.g., one or more processors) that is programmed using microcode or software to perform the functions recited above.

In this respect, it should be appreciated that one implementation of the embodiments of the technology provided herein comprises at least one computer-readable medium (e.g., a computer memory, a floppy disk, a compact disk, a tape, etc.) encoded with a computer program (i.e., a plurality of instructions), which, when executed on a processor, performs one or more of the above-discussed functions of the technology provided herein. The computer-readable medium can be transportable such that the program stored thereon can be loaded onto any computer system resource to implement one or more functions of the technology provided herein. In addition, it should be appreciated that the reference to a computer program which, when executed, performs the above-discussed functions, is not limited to an application program running on a host computer. Rather, the term computer program is used herein in a generic sense to reference any type of computer code (e.g., software or microcode) that can be employed to program a processor to implement the above-discussed aspects of the technology provided herein.

It should be appreciated that in accordance with several embodiments of the technology provided herein wherein processes are stored in a computer readable medium, the computer implemented processes may, during the course of their execution, receive input manually (e.g., from a user).

Accordingly, overall system-level control of the assembly devices or components described herein may be performed by a system controller which may provide control signals to the associated nucleic acid synthesizers, liquid handling devices, thermal cyclers, sequencing devices, associated robotic components, as well as other suitable systems for performing the desired input/output or other control functions. Thus, the system controller along with any device controllers together form a controller that controls the operation of a nucleic acid assembly system. The controller may include a general purpose data processing system, which can be a general purpose computer, or network of general purpose computers, and other associated devices, including communications devices, modems, and/or other circuitry or components to perform the desired input/output or other functions. The controller can also be implemented, at least in part, as a single special purpose integrated circuit (e.g., ASIC) or an array of ASICs, each having a main or central processor section for overall, system-level control, and separate sections dedicated to performing various different specific computations, functions and other processes under the control of the central processor section. The controller can also be implemented using a plurality of separate dedicated programmable integrated or other electronic circuits or devices, e.g., hard wired electronic or logic circuits such as discrete element circuits or programmable logic devices. The controller can also include any other components or devices, such as user input/output devices (monitors, displays, printers, a keyboard, a user pointing device, touch screen, or other user interface, etc.), data storage devices, drive motors, linkages, valve controllers, robotic devices, vacuum and other pumps, pressure sensors, detectors, power supplies, pulse sources, communication devices or other electronic circuitry or components, and so on.
The controlleralso may control operation of other portions of a system, such asautomated client order processing, quality control, packaging, shipping,billing, etc., to perform other suitable functions known in the art butnot described in detail herein.
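By way of non-limiting illustration only, the relationship between a system controller and its device controllers might be sketched in software as follows. The class names (`DeviceController`, `SystemController`), the device names, the command strings, and the `ask_user` callback are all hypothetical and are introduced solely for this sketch; they do not describe any particular instrument interface.

```python
from typing import Callable, Dict, List, Optional, Tuple


class DeviceController:
    """Hypothetical controller for a single device (e.g., a thermal cycler)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.log: List[str] = []  # commands received so far

    def execute(self, command: str) -> str:
        # A real device controller would drive hardware here;
        # this sketch only records the command and reports completion.
        self.log.append(command)
        return f"{self.name}: {command} done"


class SystemController:
    """Hypothetical system-level controller that routes control signals."""

    def __init__(self) -> None:
        self.devices: Dict[str, DeviceController] = {}

    def register(self, device: DeviceController) -> None:
        self.devices[device.name] = device

    def run(
        self,
        protocol: List[Tuple[str, Optional[str]]],
        ask_user: Optional[Callable[[str], str]] = None,
    ) -> List[str]:
        """Send each (device, command) step to the matching device controller.

        A step whose command is None is resolved by asking the user,
        mirroring the manual input mentioned in the description.
        """
        results = []
        for device_name, command in protocol:
            if device_name not in self.devices:
                raise KeyError(f"no controller registered for {device_name!r}")
            if command is None:
                if ask_user is None:
                    raise ValueError(f"step for {device_name!r} needs manual input")
                command = ask_user(device_name)
            results.append(self.devices[device_name].execute(command))
        return results


if __name__ == "__main__":
    controller = SystemController()
    for name in ("synthesizer", "liquid_handler", "thermal_cycler"):
        controller.register(DeviceController(name))
    steps = [("synthesizer", "synthesize"), ("thermal_cycler", None)]
    for line in controller.run(steps, ask_user=lambda device: "cycle"):
        print(line)
```

In this sketch the system controller holds no device-specific logic; each device controller could equally be a separate computer, an ASIC, or a programmable logic device, consistent with the alternatives listed above.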

Various aspects of the present invention may be used alone, in combination, or in a variety of arrangements not specifically discussed in the embodiments described in the foregoing and are therefore not limited in their application to the details and arrangement of components set forth in the foregoing description or illustrated in the drawings. For example, aspects described in one embodiment may be combined in any manner with aspects described in other embodiments.

Use of ordinal terms such as “first,” “second,” “third,” etc., in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another or the temporal order in which acts of a method are performed, but are used merely as labels to distinguish one claim element having a certain name from another element having a same name (but for use of the ordinal term) to distinguish the claim elements.

Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of “including,” “comprising,” or “having,” “containing,” “involving,” and variations thereof herein, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.

EQUIVALENTS

The present invention provides among other things novel methods and devices for high-fidelity gene assembly. While specific embodiments of the subject invention have been discussed, the above specification is illustrative and not restrictive. Many variations of the invention will become apparent to those skilled in the art upon review of this specification. The full scope of the invention should be determined by reference to the claims, along with their full scope of equivalents, and the specification, along with such variations.

INCORPORATION BY REFERENCE

Reference is made to U.S. provisional application No. 61/412,937, filed Nov. 12, 2010, entitled “Methods and Devices for Nucleic Acids Synthesis”; U.S. provisional application No. 61/418,095, filed Nov. 30, 2010, entitled “Methods and Devices for Nucleic Acids Synthesis”; and U.S. provisional application No. 61/466,814, filed Mar. 23, 2011, entitled “Methods and Devices for Nucleic Acids Synthesis”; and to PCT application PCT/US2009/55267, filed Aug. 27, 2009, PCT Application PCT/US2010/055298, filed Nov. 3, 2010, and PCT Application PCT/US2010/057405, filed Nov. 19, 2010. All publications, patents and sequence database entries mentioned herein are hereby incorporated by reference in their entirety as if each individual publication or patent was specifically and individually indicated to be incorporated by reference.

1-41. (canceled)
42. A nucleic acid array system comprising: a. one or more solid support; b. a plurality of discrete features associated with the one or more solid support wherein each feature comprises a plurality of support-bound oligonucleotides having a predefined sequence wherein a first support-bound oligonucleotide comprises a terminal region that is complementary to a terminal region of a second support-bound oligonucleotide, wherein a Nth support-bound oligonucleotide comprises a terminal sequence complementary to a terminal sequence region of a (N-1)th support-bound oligonucleotide, wherein N is no less than 3; and c. at least a first plurality of anchor oligonucleotides comprising a terminal sequence that is identical to a sequence region of the first support-bound oligonucleotide.
43. The nucleic acid array system of claim 42 further comprising a second plurality of support-bound anchor oligonucleotide having a sequence region that is identical to the Nth support-bound oligonucleotide.
44-65. (canceled)
66. The nucleic acid array system of claim 42 wherein the first to the Nth support-bound oligonucleotides together comprise a target nucleic acid having a predetermined sequence.
67. The nucleic acid array system of claim 66 wherein each solid support comprises a plurality of features for assembling two or more target nucleic acids in parallel.
68. The nucleic acid array system of claim 42 further comprising an anchor support wherein the anchor oligonucleotides are immobilized on the anchor support.
69. The nucleic acid array system of claim 68 further comprising a microvolume of reagents localized on each feature for synthesizing construction oligonucleotides in a chain extension reaction using the support-bound oligonucleotides as template.
70. The nucleic acid array system of claim 69 further comprising a fluid medium for transferring the construction oligonucleotides to the anchor support.
71. A system for producing at least one target nucleic acid, comprising: a. one or more solid supports having at least a first and a second features thereon, the first feature having a first plurality of support-bound oligonucleotides, the second feature having a second plurality of support-bound oligonucleotides, wherein each of the first plurality of support-bound oligonucleotides comprises a terminal sequence region complementary to that of the second plurality of support-bound oligonucleotides; b. an anchor support comprising a plurality of anchor oligonucleotides, each having a terminal region identical to that of the first plurality of support-bound oligonucleotides; and c. a solution of reagents for synthesizing construction oligonucleotides in a chain extension reaction using the support-bound oligonucleotides as template.
72. The system of claim 71 further comprising a Nth plurality of support-bound oligonucleotides comprising a terminal sequence complementary to that of a (N-1)th plurality of support-bound oligonucleotides, wherein N>=3.
73. The system of claim 72 wherein the first to the Nth pluralities of support-bound oligonucleotides together comprise a target nucleic acid having a predetermined sequence.
74. The system of claim 73 wherein each solid support comprises a plurality of features for assembling two or more target nucleic acids in parallel.
75. The system of claim 71 wherein the solution comprises a plurality of microvolumes each localized on a feature.
76. The system of claim 71 further comprising a fluid medium for transferring the construction oligonucleotides to the anchor support.
77. The system of claim 71 wherein the support-bound oligonucleotides are synthesized or spotted on the one or more solid supports.
78. A method for producing at least one target nucleic acid, comprising: a. providing the nucleic acid array system of claim 42; b. synthesizing a first to Nth construction oligonucleotides in a chain extension reaction using the support-bound oligonucleotides as template; c. transferring the first to Nth construction oligonucleotides to the first plurality of anchor oligonucleotides; and d. hybridizing the first to Nth construction oligonucleotides with the first plurality of anchor oligonucleotides, thereby producing at least one target nucleic acid.
79. A method for producing at least one target nucleic acid, comprising: a. providing the nucleic acid array system of claim 71; b. synthesizing a first and second construction oligonucleotides using the support-bound oligonucleotides as template; c. transferring the first and second construction oligonucleotides to the anchor support; and d. hybridizing the first and second construction oligonucleotides with the anchor oligonucleotides, thereby producing at least one target nucleic acid.
80. A method for producing at least one target nucleic acid, comprising: a. providing the nucleic acid array system of claim 73; b. synthesizing a first to a Nth construction oligonucleotides using the support-bound oligonucleotides as template; c. transferring the first to the Nth construction oligonucleotides to the anchor support; and d. hybridizing the first to the Nth construction oligonucleotides with the anchor oligonucleotides, thereby producing at least one target nucleic acid.
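For illustration only, and without limiting the claims, the sequence relationships recited in claim 42 (each oligonucleotide sharing a complementary terminal region with the preceding one, and an anchor whose terminal sequence is identical to a region of the first oligonucleotide) might be checked in software as sketched below. The function names, the fixed terminal-region length `k`, the decision to test both ends of each sequence, and the example sequences are assumptions made for this sketch and are not taken from the specification.

```python
from typing import List

# Translation table for complementing DNA bases (A<->T, C<->G).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def share_complementary_terminus(a: str, b: str, k: int) -> bool:
    """True if a k-base terminal region of `a` can base-pair with one of `b`.

    Assumption for this sketch: a terminal region is the first or last
    k bases, and either end of either oligonucleotide may take part.
    """
    ends_a = {a[:k], a[-k:]}
    ends_b = {b[:k], b[-k:]}
    return any(reverse_complement(end) in ends_a for end in ends_b)


def is_overlapping_chain(oligos: List[str], k: int) -> bool:
    """True if each oligonucleotide overlaps the previous one (N of at least 3)."""
    if len(oligos) < 3:
        return False  # claim 42 recites N no less than 3
    return all(
        share_complementary_terminus(oligos[i - 1], oligos[i], k)
        for i in range(1, len(oligos))
    )


def anchor_matches_first(anchor: str, first: str, k: int) -> bool:
    """True if a k-base terminal sequence of the anchor occurs within `first`."""
    return anchor[:k] in first or anchor[-k:] in first


if __name__ == "__main__":
    # Made-up example sequences, chosen only so that the termini overlap.
    oligos = ["ATGCAAGT", "GGACACTT", "GTCCTTAG"]
    print(is_overlapping_chain(oligos, k=4))
    print(anchor_matches_first("TTTTATGC", oligos[0], k=4))
```

A practical design tool would also account for strand orientation, melting temperature, and unintended cross-hybridization, none of which this sketch attempts.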